Abstract. Low-rank matrix recovery (LMR) is a rank minimization problem subject to
linear equality constraints, and it arises in many fields such as signal and image processing,
statistics, computer vision, system identification and control. This class of optimization problems
is NP-hard, and a popular approach replaces the rank function with the nuclear norm of the
matrix variable. In this paper, we extend the concept of s-goodness for a sensing matrix in
sparse signal recovery (proposed by Juditsky and Nemirovski [Math Program, 2011]) to linear
transformations in LMR. Then, we give characterizations of s-goodness in the context of LMR.
Using the two characteristic s-goodness constants, $\gamma_s$ and $\hat{\gamma}_s$, of a linear
transformation, we not only derive necessary and sufficient conditions for a linear transformation
to be s-good, but also provide sufficient conditions for exact and stable s-rank matrix recovery
via nuclear norm minimization under mild assumptions. Moreover, we give computable upper bounds
for one of the s-goodness characteristics, which lead to verifiable sufficient conditions for exact
low-rank matrix recovery.



1. Introduction 

Low-rank matrix recovery (LMR for short) is a rank minimization problem (RMP) with
linear constraints, also known as the affine matrix rank minimization problem, defined as follows:

(1) minimize $\operatorname{rank}(X)$, subject to $\mathcal{A}X = b$,

where $X \in \mathbb{R}^{m \times n}$ is the matrix variable, $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$
is a linear transformation and $b \in \mathbb{R}^p$. Although specific instances can often be solved with
specialized algorithms, the LMR is NP-hard. A popular approach for solving LMR in the systems and
control community is to minimize the trace of a positive semidefinite matrix variable instead of
the rank (see, e.g., [2J EB]). A generalization of this approach to non-symmetric matrices, introduced
by Fazel, Hindi and Boyd [17], is the famous convex relaxation of LMR (1), which is called nuclear norm
minimization (NNM):

(2) $\min \|X\|_*$ s.t. $\mathcal{A}X = b$,

where $\|X\|_*$ is the nuclear norm of X, i.e., the sum of its singular values. When m = n and the
matrix $X := \operatorname{Diag}(x)$, $x \in \mathbb{R}^n$, is diagonal, the LMR (1) reduces to sparse signal
recovery (SSR), which is the so-called cardinality minimization problem (CMP):

(3) $\min \|x\|_0$ s.t. $\Phi x = b$,

where $\|x\|_0$ denotes the number of nonzero entries in the vector x and $\Phi \in \mathbb{R}^{m \times n}$ is a
sensing matrix. A well-known heuristic for SSR is the $\ell_1$-norm minimization relaxation (basis
pursuit problem):

$\min \|x\|_1$ s.t. $\Phi x = b$,

where $\|x\|_1$ is the $\ell_1$-norm of x, i.e., the sum of absolute values of its entries.
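
The reduction from LMR to SSR for diagonal matrices can be checked numerically. The following is a minimal sketch, not part of the original development: the helper names are hypothetical, and the closed-form singular values are specific to 2 x 2 matrices. It confirms on two small vectors that the rank of Diag(x) equals $\|x\|_0$ and that its nuclear norm equals $\|x\|_1$.

```python
import math

def singular_values_2x2(a, b, c, d):
    """Return (s1, s2), s1 >= s2 >= 0, the singular values of [[a, b], [c, d]]."""
    frob_sq = a * a + b * b + c * c + d * d   # equals s1^2 + s2^2
    det = a * d - b * c                       # |det| equals s1 * s2
    disc = math.sqrt(max(frob_sq ** 2 - 4.0 * det ** 2, 0.0))
    s1 = math.sqrt((frob_sq + disc) / 2.0)
    s2 = math.sqrt(max((frob_sq - disc) / 2.0, 0.0))
    return s1, s2

def rank_and_nuclear_norm_of_diag(x, tol=1e-12):
    """Rank and nuclear norm of Diag(x) for a 2-vector x, from its singular values."""
    s1, s2 = singular_values_2x2(x[0], 0.0, 0.0, x[1])
    rank = sum(1 for s in (s1, s2) if s > tol)
    return rank, s1 + s2

for x in [(3.0, 0.0), (-2.0, 5.0)]:
    rank, nuc = rank_and_nuclear_norm_of_diag(x)
    card = sum(1 for xi in x if xi != 0.0)    # ||x||_0
    l1 = sum(abs(xi) for xi in x)             # ||x||_1
    print(x, rank == card, abs(nuc - l1) < 1e-9)
```

The check goes through the singular values rather than reading the diagonal directly, so it does not assume the identity it is meant to illustrate.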

LMR problems have many applications and have appeared in the literature of a diverse set
of fields, including signal and image processing, statistics, computer vision, system identification
and control. For more details, see the recent survey paper [33]. LMR and NNM have been
the focus of recent research in the optimization community; see, e.g., [U HJ [Til (III E3J EH
[25], [26], [32], [33], [35], [37]. Although there are many papers dealing with algorithms for NNM,
such as interior-point methods, fixed point and Bregman iterative methods and proximal point
methods, there are very few papers dealing with conditions that guarantee the success of
low-rank matrix recovery via NNM. For instance, following the program laid out in the work of
Candes and Tao in compressed sensing (CS, see, e.g., [12 H3 [15]), Recht, Fazel and Parrilo [33]
provided a certain restricted isometry property (RIP) condition on the linear transformation
which guarantees that the minimum nuclear norm solution is the minimum rank solution. Recht,
Xu and Hassibi [35], [34] gave another condition which characterizes a particular property of the
null-space of the linear transformation.

In the setting of CS, there are other characterizations of the sensing matrix, under which
$\ell_1$-norm minimization can be guaranteed to yield an optimal solution to SSR, in addition to
RIP and null-space properties, see, e.g., [TBI H51 UH1 EQ]. In particular, Juditsky and Nemirovski
[18] established necessary and sufficient conditions for a sensing matrix to be "s-good", that is, to
allow for exact $\ell_1$-recovery of sparse signals with s nonzero entries when no measurement noise is
present. They also demonstrated that these characteristics, although difficult to evaluate, lead
to verifiable sufficient conditions for exact SSR and to efficiently computable upper bounds on
those s for which a given sensing matrix is s-good. Furthermore, they established instructive
links between s-goodness and RIP in the CS context. One may wonder whether we can generalize
the s-goodness concept to LMR and still maintain many of the nice properties established in [18].
Here, we deal with this issue. Our approach is based on the singular value decomposition (SVD)
of a matrix and the partition technique generalized from CS. In the next section, following
Juditsky and Nemirovski's terminology, we propose definitions of s-goodness and G-numbers of
a linear transformation in LMR. We provide some basic properties of G-numbers. In Section
3, we characterize s-goodness of a linear transformation in LMR via G-numbers. We establish
the exact and stable LMR results in Section 4. In Section 5, we show that these characteristics
lead to verifiable sufficient conditions for exact s-rank matrix recovery and to computable upper
bounds on those s for which a given linear transformation is s-good. In Section 6, we consider the
connection between s-goodness and RIP for a linear transformation in LMR. As a byproduct,
we obtain the new bound on the restricted isometry constant $\delta_{2s} < \sqrt{2} - 1$. As we were in the
final stages of the preparation of this paper, Oymak, Mohan, Fazel and Hassibi [31] proposed
a general technique for translating results from SSR to LMR, where they give the current
best bound on the restricted isometry constant, $\delta_{2s} < 0.472$. These results were obtained
independently. A difference between the results is that we follow Juditsky and Nemirovski's
geometric, optimization-based approach.

Let $W \in \mathbb{R}^{m \times n}$, $r := \min\{m, n\}$, and let $W = U \operatorname{Diag}(\sigma(W)) V^T$ be the SVD of W, where
$U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{n \times r}$, and $\operatorname{Diag}(\sigma(W))$ is the diagonal matrix of
$\sigma(W) = (\sigma_1(W), \ldots, \sigma_r(W))^T$, which is the vector of the singular values of W. Also let $\Xi(W)$
denote the set of pairs of matrices (U, V) in the SVD of W, i.e.,

$\Xi(W) := \{(U, V) : U \in \mathbb{R}^{m \times r},\ V \in \mathbb{R}^{n \times r},\ W = U \operatorname{Diag}(\sigma(W)) V^T\}$.

For $s \in \{0, 1, 2, \ldots, r\}$, we say $W \in \mathbb{R}^{m \times n}$ is an s-rank matrix to mean that the rank of W is no
more than s. For an s-rank matrix W, it is convenient to take $W = U_{m \times s} W_s V_{n \times s}^T$ as its SVD, where
$U_{m \times s} \in \mathbb{R}^{m \times s}$, $V_{n \times s} \in \mathbb{R}^{n \times s}$ are orthogonal matrices and
$W_s = \operatorname{Diag}((\sigma_1(W), \ldots, \sigma_s(W))^T)$. For a vector $y \in \mathbb{R}^p$, let $\|\cdot\|_d$ be the dual norm of
$\|\cdot\|$ specified by $\|y\|_d := \max_v \{\langle v, y \rangle : \|v\| \le 1\}$.
In particular, $\|\cdot\|_\infty$ is the dual norm of $\|\cdot\|_1$ for a vector. Let $\|X\|$ denote the spectral or
operator norm of a matrix $X \in \mathbb{R}^{m \times n}$, i.e., the largest singular value of X. In fact, $\|X\|$ is the
dual norm of $\|X\|_*$. Let $\|X\|_F := \sqrt{\langle X, X \rangle} = \sqrt{\operatorname{Tr}(X^T X)}$ be the Frobenius norm of X, which
is equal to the $\ell_2$-norm of the vector of its singular values. We denote by $X^T$ the transpose of
X. For a linear transformation $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$, we denote by
$\mathcal{A}^* : \mathbb{R}^p \to \mathbb{R}^{m \times n}$ the adjoint of $\mathcal{A}$.
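
As a small illustration of the three matrix norms just introduced, the following worked example evaluates them on a 2 x 2 diagonal matrix. The matrix is chosen here only for illustration and does not appear elsewhere in the paper. The last line checks the duality between the spectral and nuclear norms on the identity matrix I, where the bound is attained.

```latex
W = \begin{pmatrix} 3 & 0 \\ 0 & 4 \end{pmatrix}, \qquad \sigma(W) = (4, 3)^T,
\\
\|W\|_* = 4 + 3 = 7, \qquad \|W\| = 4, \qquad
\|W\|_F = \sqrt{3^2 + 4^2} = 5 = \|\sigma(W)\|_2,
\\
\langle W, I \rangle = \operatorname{Tr}(W) = 7 \le \|W\|_* \, \|I\| = 7 \cdot 1 .
```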

2. Preliminaries 

2.1. Definitions. We first go over some concepts related to s-goodness of the linear
transformation in LMR (RMP). These are extensions of those given for SSR (CMP) in [18].

Definition 2.1. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation and $s \in \{0, 1, 2, \ldots, r\}$. We
say that $\mathcal{A}$ is s-good if, for every s-rank matrix $W \in \mathbb{R}^{m \times n}$, W is the unique optimal solution
to the optimization problem

(4) $\min_{X \in \mathbb{R}^{m \times n}} \{\|X\|_* : \mathcal{A}X = \mathcal{A}W\}$.

We denote by $s_*(\mathcal{A})$ the largest integer s for which $\mathcal{A}$ is s-good. Clearly, $s_*(\mathcal{A}) \in \{0, 1, \ldots, r\}$.
To characterize s-goodness we introduce two useful s-goodness constants, $\gamma_s$ and $\hat{\gamma}_s$; we call
$\gamma_s$ and $\hat{\gamma}_s$ G-numbers.

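Definition 2.2. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, $\beta \in [0, +\infty]$ and
$s \in \{0, 1, 2, \ldots, r\}$. Then,

(i) G-number $\gamma_s(\mathcal{A}, \beta)$ is the infimum of $\gamma \ge 0$ such that for every matrix $X \in \mathbb{R}^{m \times n}$ with
singular value decomposition $X = U_{m \times s} V_{n \times s}^T$ (i.e., s nonzero singular values, all equal to 1),
there exists a vector $y \in \mathbb{R}^p$ such that

(5) $\|y\|_d \le \beta$ and $\mathcal{A}^* y = U \operatorname{Diag}(\sigma(\mathcal{A}^* y)) V^T$,

where $U = [U_{m \times s}\ U_{m \times (r-s)}]$, $V = [V_{n \times s}\ V_{n \times (r-s)}]$ are orthogonal matrices, and

$\sigma_i(\mathcal{A}^* y) \begin{cases} = 1, & \text{if } \sigma_i(X) = 1, \\ \in [0, \gamma], & \text{if } \sigma_i(X) = 0, \end{cases} \quad i \in \{1, 2, \ldots, r\}.$

If there does not exist such y for some X as above, we set $\gamma_s(\mathcal{A}, \beta) = +\infty$.

(ii) G-number $\hat{\gamma}_s(\mathcal{A}, \beta)$ is the infimum of $\gamma \ge 0$ such that for every matrix $X \in \mathbb{R}^{m \times n}$ with s
nonzero singular values, all equal to 1, there exists a vector $y \in \mathbb{R}^p$ such that

(6) $\|y\|_d \le \beta$ and $\|\mathcal{A}^* y - X\| \le \gamma$.

If there does not exist such y for some X as above, we set $\hat{\gamma}_s(\mathcal{A}, \beta) = +\infty$. To be compatible
with the special case given by [18], we write $\gamma_s(\mathcal{A})$, $\hat{\gamma}_s(\mathcal{A})$ instead of
$\gamma_s(\mathcal{A}, +\infty)$, $\hat{\gamma}_s(\mathcal{A}, +\infty)$, respectively.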

From the above definition, we easily see that the set of values that $\gamma$ takes is closed. Thus,
when $\gamma_s(\mathcal{A}, \beta) < +\infty$, for every matrix $X \in \mathbb{R}^{m \times n}$ with s nonzero singular values, all equal to
1, there exists a vector $y \in \mathbb{R}^p$ such that

(7) $\|y\|_d \le \beta$ and $\sigma_i(\mathcal{A}^* y) \begin{cases} = 1, & \text{if } \sigma_i(X) = 1, \\ \in [0, \gamma_s(\mathcal{A}, \beta)], & \text{if } \sigma_i(X) = 0, \end{cases} \quad i \in \{1, 2, \ldots, r\}.$

Similarly, for every matrix $X \in \mathbb{R}^{m \times n}$ with s nonzero singular values, all equal to 1, there exists
a vector $y \in \mathbb{R}^p$ such that

(8) $\|y\|_d \le \beta$ and $\|\mathcal{A}^* y - X\| \le \hat{\gamma}_s(\mathcal{A}, \beta)$.

Observing that the set $\{\mathcal{A}^* y : \|y\|_d \le \beta\}$ is convex, we obtain that if $\gamma_s(\mathcal{A}, \beta) < +\infty$, then
for every matrix X with at most s nonzero singular values and $\|X\| \le 1$ there exist vectors y
satisfying (7) and there exist vectors y satisfying (8). Moreover, for a given pair $\mathcal{A}$, s, we have
$\gamma_s(\mathcal{A}, \beta) = \gamma_s(\mathcal{A})$ and $\hat{\gamma}_s(\mathcal{A}, \beta) = \hat{\gamma}_s(\mathcal{A})$ for all $\beta$ large enough. However, we would
not want $\beta$ to be very large in some situations, see Section 4. Thus, we need to work out an answer
to the question "what is large enough" in our context. Below, we give a simple result in this
direction, as was done in the vector case; see Proposition 2 in [18] for details.

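Proposition 2.3. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation and $\beta \in [0, +\infty]$. Assume
that for some $\rho > 0$, the image of the unit $\|\cdot\|_*$-ball in $\mathbb{R}^{m \times n}$ under the mapping $X \mapsto \mathcal{A}X$
contains the ball $B = \{x \in \mathbb{R}^p : \|x\|_1 \le \rho\}$. Then for every $s \in \{1, 2, \ldots, r\}$,

$\beta \ge \frac{1}{\rho}$ and $\gamma_s(\mathcal{A}) < 1 \implies \gamma_s(\mathcal{A}, \beta) = \gamma_s(\mathcal{A})$.

Proof. Fix $s \in \{1, 2, \ldots, r\}$. Let $\gamma := \gamma_s(\mathcal{A}) < 1$. Then for every matrix $W \in \mathbb{R}^{m \times n}$ with its
SVD $W = U_{m \times s} V_{n \times s}^T$, there exists a vector $y \in \mathbb{R}^p$ such that

$\|y\|_d \le \beta$ and $\mathcal{A}^* y = U \operatorname{Diag}(\sigma(\mathcal{A}^* y)) V^T$,

where $U = [U_{m \times s}\ U_{m \times (r-s)}]$, $V = [V_{n \times s}\ V_{n \times (r-s)}]$ are orthogonal matrices, and

$\sigma_i(\mathcal{A}^* y) \begin{cases} = 1, & \text{if } \sigma_i(W) = 1, \\ \in [0, \gamma], & \text{if } \sigma_i(W) = 0, \end{cases} \quad i \in \{1, 2, \ldots, r\}.$

Clearly, $\|\mathcal{A}^* y\| \le 1$. That is,

$1 \ge \|\mathcal{A}^* y\| = \max_{X \in \mathbb{R}^{m \times n}} \{\langle X, \mathcal{A}^* y \rangle : \|X\|_* \le 1\} = \max_{X \in \mathbb{R}^{m \times n}} \{\langle u, y \rangle : u = \mathcal{A}X,\ \|X\|_* \le 1\}.$

From the inclusion assumption, we obtain that

$\max_{X \in \mathbb{R}^{m \times n}} \{\langle u, y \rangle : u = \mathcal{A}X,\ \|X\|_* \le 1\} \ge \max_u \{\langle u, y \rangle : \|u\|_1 \le \rho\} = \rho \|y\|_\infty = \rho \|y\|_d.$

Combining the above two strings of relations, we derive the desired conclusion. □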

2.2. Convexity and monotonicity of G-numbers. In order to characterize the s-goodness
of a linear transformation $\mathcal{A}$, we study convexity and monotonicity properties of G-numbers. We
begin with the result that G-numbers $\gamma_s(\mathcal{A}, \beta)$ and $\hat{\gamma}_s(\mathcal{A}, \beta)$ are convex nonincreasing functions
of $\beta$.

Proposition 2.4. For every linear transformation $\mathcal{A}$ and every $s \in \{0, 1, \ldots, r\}$, G-numbers
$\gamma_s(\mathcal{A}, \beta)$ and $\hat{\gamma}_s(\mathcal{A}, \beta)$ are convex nonincreasing functions of $\beta \in [0, +\infty]$.

Proof. We only need to demonstrate that the quantity $\gamma_s(\mathcal{A}, \beta)$ is a convex nonincreasing
function of $\beta \in [0, +\infty]$. It is evident from the definition that $\gamma_s(\mathcal{A}, \beta)$ is nonincreasing for given
$\mathcal{A}$, s. It remains to show that $\gamma_s(\mathcal{A}, \beta)$ is a convex function of $\beta$. In other words, for every pair
$\beta_1, \beta_2 \in [0, +\infty]$, we need to verify that

$\gamma_s(\mathcal{A}, \alpha \beta_1 + (1 - \alpha) \beta_2) \le \alpha \gamma_s(\mathcal{A}, \beta_1) + (1 - \alpha) \gamma_s(\mathcal{A}, \beta_2), \quad \forall \alpha \in [0, 1].$

The above inequality holds immediately if one of $\beta_1, \beta_2$ is $+\infty$. Thus, we may assume
$\beta_1, \beta_2 \in [0, +\infty)$. In fact, from the argument around (7) and the definition of $\gamma_s(\mathcal{A}, \cdot)$, we know that
for every matrix $X = U \operatorname{Diag}(\sigma(X)) V^T$ with s nonzero singular values, all equal to 1, there exist
vectors $y_1, y_2 \in \mathbb{R}^p$ such that for $k \in \{1, 2\}$,

(9) $\|y_k\|_d \le \beta_k$ and $\sigma_i(\mathcal{A}^* y_k) \begin{cases} = 1, & \text{if } \sigma_i(X) = 1, \\ \in [0, \gamma_s(\mathcal{A}, \beta_k)], & \text{if } \sigma_i(X) = 0, \end{cases} \quad i \in \{1, 2, \ldots, r\}.$

It is immediate from (9) that $\|\alpha y_1 + (1 - \alpha) y_2\|_d \le \alpha \beta_1 + (1 - \alpha) \beta_2$. Moreover, from the above
information on the singular values of $\mathcal{A}^* y_1$, $\mathcal{A}^* y_2$, we may set $\mathcal{A}^* y_k = X + Y_k$, $k \in \{1, 2\}$, such
that

$X^T Y_k = 0$, $X Y_k^T = 0$, $\operatorname{rank}(Y_k) \le r - s$, and $\|Y_k\| \le \gamma_s(\mathcal{A}, \beta_k)$.

This implies for every $\alpha \in [0, 1]$,

$X^T [\alpha Y_1 + (1 - \alpha) Y_2] = 0, \quad X [\alpha Y_1 + (1 - \alpha) Y_2]^T = 0,$

and hence $\operatorname{rank}[\alpha Y_1 + (1 - \alpha) Y_2] \le r - s$, and X and $[\alpha Y_1 + (1 - \alpha) Y_2]$ have orthogonal
row and column spaces. Thus, noting that $\mathcal{A}^* [\alpha y_1 + (1 - \alpha) y_2] = X + \alpha Y_1 + (1 - \alpha) Y_2$, we
obtain that $\|\alpha y_1 + (1 - \alpha) y_2\|_d \le \alpha \beta_1 + (1 - \alpha) \beta_2$ and

$\sigma_i(\mathcal{A}^* (\alpha y_1 + (1 - \alpha) y_2)) = \begin{cases} 1, & \text{if } \sigma_i(X) = 1, \\ \sigma_i(\alpha Y_1 + (1 - \alpha) Y_2), & \text{if } \sigma_i(X) = 0, \end{cases}$

for every $\alpha \in [0, 1]$. Combining this with the fact

$\|\alpha Y_1 + (1 - \alpha) Y_2\| \le \alpha \|Y_1\| + (1 - \alpha) \|Y_2\| \le \alpha \gamma_s(\mathcal{A}, \beta_1) + (1 - \alpha) \gamma_s(\mathcal{A}, \beta_2),$

we obtain the desired conclusion. □
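The following observation that G-numbers $\gamma_s(\mathcal{A}, \beta)$, $\hat{\gamma}_s(\mathcal{A}, \beta)$ are nondecreasing in s is
immediate.

Proposition 2.5. For every $s' \le s$, we have $\gamma_{s'}(\mathcal{A}, \beta) \le \gamma_s(\mathcal{A}, \beta)$ and $\hat{\gamma}_{s'}(\mathcal{A}, \beta) \le \hat{\gamma}_s(\mathcal{A}, \beta)$.

We further investigate the relationship between the G-numbers $\gamma_s(\mathcal{A}, \beta)$ and $\hat{\gamma}_s(\mathcal{A}, \beta)$. The
following result generalizes the second part of Theorem 1 of [18] (and its proof).

Proposition 2.6. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, $\beta \in [0, +\infty]$ and
$s \in \{0, 1, 2, \ldots, r\}$. Then we have

(10) $\gamma := \gamma_s(\mathcal{A}, \beta) < 1 \implies \hat{\gamma}_s\left(\mathcal{A}, \frac{1}{1 + \gamma} \beta\right) = \frac{\gamma}{1 + \gamma} < \frac{1}{2};$

(11) $\gamma := \hat{\gamma}_s(\mathcal{A}, \beta) < \frac{1}{2} \implies \gamma_s\left(\mathcal{A}, \frac{1}{1 - \gamma} \beta\right) = \frac{\gamma}{1 - \gamma} < 1.$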

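Proof. Let $\gamma := \gamma_s(\mathcal{A}, \beta) < 1$. Then, for every matrix $Z \in \mathbb{R}^{m \times n}$ with s nonzero singular
values, all equal to 1, there exists $y \in \mathbb{R}^p$, $\|y\|_d \le \beta$, such that $\mathcal{A}^* y = Z + W$, where $\|W\| \le \gamma$
and W and Z have orthogonal row and column spaces. For a given pair Z, y as above,
take $\tilde{y} := \frac{1}{1 + \gamma} y$. Then we have $\|\tilde{y}\|_d \le \frac{1}{1 + \gamma} \beta$ and

$\|\mathcal{A}^* \tilde{y} - Z\| \le \max\left\{1 - \frac{1}{1 + \gamma},\ \frac{\gamma}{1 + \gamma}\right\} = \frac{\gamma}{1 + \gamma},$

where the first term under the maximum comes from the fact that $\mathcal{A}^* y$ and Z agree on the
subspace corresponding to the nonzero singular values of Z. Therefore, we obtain

(12) $\hat{\gamma}_s\left(\mathcal{A}, \frac{1}{1 + \gamma} \beta\right) \le \frac{\gamma}{1 + \gamma} < \frac{1}{2}.$

Now, we assume that $\gamma := \hat{\gamma}_s(\mathcal{A}, \beta) < 1/2$. Fix orthogonal matrices $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{n \times r}$. For
an s-element subset J of the index set $\{1, 2, \ldots, r\}$, we define a set $S_J$ with respect to orthogonal
matrices U, V as

$S_J := \left\{x \in \mathbb{R}^r : \exists y \in \mathbb{R}^p,\ \|y\|_d \le \beta,\ \mathcal{A}^* y = U \operatorname{Diag}(\sigma(\mathcal{A}^* y)) V^T, \text{ where } \sigma_i(\mathcal{A}^* y) \begin{cases} = |x_i|, & \text{if } i \in J, \\ \le \gamma, & \text{if } i \in \bar{J} \end{cases}\right\}.$

In the above, $\bar{J}$ denotes the complement of J. It is immediately seen that $S_J$ is a closed convex
set in $\mathbb{R}^r$. As in the proof of Theorem 1 in [18], we have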
Claim 1. $S_J$ contains the $\|\cdot\|_\infty$-ball of radius $(1 - \gamma)$ centered at the origin in $\mathbb{R}^r$.

Proof. Note that $S_J$ is closed and convex. Moreover, $S_J$ is the direct sum of its projections onto
the pair of subspaces

$L_J := \{x \in \mathbb{R}^r : x_i = 0,\ i \in \bar{J}\}$ and its orthogonal complement $L_J^\perp = \{x \in \mathbb{R}^r : x_i = 0,\ i \in J\}$.

Let Q denote the projection of $S_J$ onto $L_J$. Then, Q is closed and convex (because of the direct
sum property above and the fact that $S_J$ is closed and convex). Note that $L_J$ can be naturally
identified with $\mathbb{R}^s$, and our claim is that the image $\bar{Q} \subset \mathbb{R}^s$ of Q under this identification contains
the $\|\cdot\|_\infty$-ball $B_s$ of radius $(1 - \gamma)$ centered at the origin in $\mathbb{R}^s$. For a contradiction, suppose
$B_s$ is not contained in $\bar{Q}$. Then there exists $v \in B_s \setminus \bar{Q}$. Since $\bar{Q}$ is closed and convex, by a
separating hyperplane theorem, there exists a vector $u \in \mathbb{R}^s$, $\|u\|_1 = 1$, such that

$u^T v > u^T v'$ for every $v' \in \bar{Q}$.

Let $z \in \mathbb{R}^r$ be defined by

$z_i := \begin{cases} 1, & i \in J, \\ 0, & \text{otherwise}. \end{cases}$

By definition of $\gamma = \hat{\gamma}_s(\mathcal{A}, \beta)$, for the s-rank matrix $U \operatorname{Diag}(z) V^T$, there exists $y \in \mathbb{R}^p$ such that
$\|y\|_d \le \beta$ and

$\mathcal{A}^* y = U \operatorname{Diag}(z) V^T + W,$

where W and $U \operatorname{Diag}(z) V^T$ have the same row and column spaces, $\|\mathcal{A}^* y - U \operatorname{Diag}(z) V^T\| \le \gamma$ and
$\|\sigma(\mathcal{A}^* y) - z\|_\infty \le \gamma$. Together with the definitions of $S_J$ and $\bar{Q}$, this means that $\bar{Q}$ contains a
vector $\bar{v}$ with $|\bar{v}_i - \operatorname{sign}(u_i)| \le \gamma$, $\forall i \in \{1, 2, \ldots, s\}$. Therefore,

$u^T \bar{v} \ge \sum_{i=1}^{s} |u_i| (1 - \gamma) = (1 - \gamma) \|u\|_1 = 1 - \gamma.$

By $v \in B_s$ and the definition of u, we obtain

$1 - \gamma \ge \|v\|_\infty = \|u\|_1 \|v\|_\infty \ge u^T v > u^T \bar{v} \ge 1 - \gamma,$

where the strict inequality follows from the facts that $\bar{v} \in \bar{Q}$ and u separates v from $\bar{Q}$. The
above string of inequalities is a contradiction, and hence the desired claim holds.

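Using the above claim, we conclude that for every $J \subset \{1, 2, \ldots, r\}$ with cardinality s, there
exists an $x \in S_J$ such that $x_i = (1 - \gamma)$, $\forall i \in J$. From the definition of $S_J$, we obtain that there
exists $y \in \mathbb{R}^p$ with $\|y\|_d \le (1 - \gamma)^{-1} \beta$ such that

$\mathcal{A}^* y = U \operatorname{Diag}(\sigma(\mathcal{A}^* y)) V^T,$

where $\sigma_i(\mathcal{A}^* y) = (1 - \gamma)^{-1} x_i = 1$ if $i \in J$, and $\sigma_i(\mathcal{A}^* y) \le (1 - \gamma)^{-1} \gamma$ if $i \in \bar{J}$. Thus, we obtain
that

(13) $\gamma := \hat{\gamma}_s(\mathcal{A}, \beta) < \frac{1}{2} \implies \gamma_s\left(\mathcal{A}, \frac{1}{1 - \gamma} \beta\right) \le \frac{\gamma}{1 - \gamma} < 1.$

To conclude the proof, we need to prove that the inequalities we established in (12) and (13),

$\hat{\gamma}_s\left(\mathcal{A}, \frac{1}{1 + \gamma} \beta\right) \le \frac{\gamma}{1 + \gamma}$ and $\gamma_s\left(\mathcal{A}, \frac{1}{1 - \gamma} \beta\right) \le \frac{\gamma}{1 - \gamma},$

are both equations. This is straightforward by an argument similar to the one in the proof of
Theorem 1 in [18]. We omit it for the sake of brevity. □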
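
The correspondence in Proposition 2.6 between the two G-numbers is a pair of mutually inverse maps on their values. The sketch below uses hypothetical function names and exact rational arithmetic to illustrate that $\gamma < 1$ is carried to a value below 1/2 and back. It only manipulates the scalar values and does not compute G-numbers of any concrete linear transformation.

```python
from fractions import Fraction

def gamma_to_gamma_hat(gamma):
    """Value gamma / (1 + gamma) appearing in (10); assumes 0 <= gamma < 1."""
    if not 0 <= gamma < 1:
        raise ValueError("relation (10) assumes 0 <= gamma < 1")
    return gamma / (1 + gamma)

def gamma_hat_to_gamma(gamma_hat):
    """Value gamma_hat / (1 - gamma_hat) appearing in (11); assumes 0 <= gamma_hat < 1/2."""
    if not 0 <= gamma_hat < Fraction(1, 2):
        raise ValueError("relation (11) assumes 0 <= gamma_hat < 1/2")
    return gamma_hat / (1 - gamma_hat)

for g in [Fraction(0), Fraction(1, 3), Fraction(9, 10)]:
    g_hat = gamma_to_gamma_hat(g)
    # Each row: gamma, its image, whether the image is < 1/2, whether the round trip returns gamma.
    print(g, g_hat, g_hat < Fraction(1, 2), gamma_hat_to_gamma(g_hat) == g)
```

The rescaling of $\beta$ is consistent with this round trip: since $1 - \gamma/(1+\gamma) = 1/(1+\gamma)$, dividing $\beta/(1+\gamma)$ by $1 - \gamma/(1+\gamma)$ returns $\beta$.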
We end this section by giving an equivalent representation of the G-number $\hat{\gamma}_s(\mathcal{A}, \beta)$. The
next result generalizes Theorem 2 of [18] (and its proof). We define a compact convex set first:

$P_s := \{Z \in \mathbb{R}^{m \times n} : \|Z\|_* \le s,\ \|Z\| \le 1\}.$

Theorem 2.7. Let $\mathcal{A}$ be a linear transformation, $\beta \in [0, +\infty]$ and $s \in \{0, 1, \ldots, r\}$. Also let $P_s$
be as defined above. Then,

(14) $\hat{\gamma}_s(\mathcal{A}, \beta) = \max_{Z, X} \{\langle Z, X \rangle - \beta \|\mathcal{A}X\| : Z \in P_s,\ \|X\|_* \le 1\}.$

Moreover,

(15) $\hat{\gamma}_s(\mathcal{A}) = \max_{Z, X} \{\langle Z, X \rangle : Z \in P_s,\ \|X\|_* \le 1,\ \mathcal{A}X = 0\}.$

Proof. Let $B_\beta := \{y \in \mathbb{R}^p : \|y\|_d \le \beta\}$ and $B := \{X \in \mathbb{R}^{m \times n} : \|X\| \le 1\}$. By definition,
$\hat{\gamma}_s(\mathcal{A}, \beta)$ is the smallest $\gamma$ such that the closed convex set $C_{\gamma, \beta} := \mathcal{A}^* B_\beta + \gamma B$ contains all
matrices with s nonzero singular values, all equal to 1. Equivalently, $C_{\gamma, \beta}$ contains the convex
hull of these matrices, namely, $P_s$. Note that $\gamma$ satisfies the inclusion $P_s \subseteq C_{\gamma, \beta}$ if and only if
for every $X \in \mathbb{R}^{m \times n}$,

$\max_{Z \in P_s} \langle Z, X \rangle \le \max_{Y \in C_{\gamma, \beta}} \langle Y, X \rangle = \max_{y \in \mathbb{R}^p,\ W \in \mathbb{R}^{m \times n}} \{\langle X, \mathcal{A}^* y \rangle + \gamma \langle X, W \rangle : \|y\|_d \le \beta,\ \|W\| \le 1\}$

(16) $= \beta \|\mathcal{A}X\| + \gamma \|X\|_*.$

For the above, we adopt the convention that whenever $\beta = +\infty$, $\beta \|\mathcal{A}X\|$ is defined to be
$+\infty$ or 0 depending on whether $\|\mathcal{A}X\| > 0$ or $\|\mathcal{A}X\| = 0$. Thus, $P_s \subseteq C_{\gamma, \beta}$ if and only if
$\max_{Z \in P_s} \{\langle Z, X \rangle - \beta \|\mathcal{A}X\|\} \le \gamma \|X\|_*$. Using the homogeneity of this last relation with respect
to X, the above is equivalent to

$\max_{Z, X} \{\langle Z, X \rangle - \beta \|\mathcal{A}X\| : Z \in P_s,\ \|X\|_* \le 1\} \le \gamma.$

Therefore, the desired conclusion holds. □

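3. s-goodness and G-numbers

We first give the following characterization result of s-goodness of a linear transformation $\mathcal{A}$
via the G-number $\gamma_s(\mathcal{A})$, which explains the importance of $\gamma_s(\mathcal{A})$ in LMR. In the case of SSR,
it reduces to Theorem 1(i) in [18].

Theorem 3.1. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, and let s be an integer in
$\{0, 1, 2, \ldots, r\}$. Then $\mathcal{A}$ is s-good if and only if $\gamma_s(\mathcal{A}) < 1$.

Proof. Suppose $\mathcal{A}$ is s-good. Let $W \in \mathbb{R}^{m \times n}$ be a matrix of rank $s \in \{1, 2, \ldots, r\}$. Without
loss of generality, let $W = U_{m \times s} W_s V_{n \times s}^T$ be its SVD, where $U_{m \times s} \in \mathbb{R}^{m \times s}$, $V_{n \times s} \in \mathbb{R}^{n \times s}$ are
orthogonal matrices and $W_s = \operatorname{Diag}((\sigma_1(W), \ldots, \sigma_s(W))^T)$. By the definition of s-goodness of
$\mathcal{A}$, W is the unique solution to the optimization problem (4). Using the first order optimality
conditions, we obtain that there exists $y \in \mathbb{R}^p$ such that the function
$f_y(X) = \|X\|_* - y^T [\mathcal{A}X - \mathcal{A}W]$ attains its minimum value over $X \in \mathbb{R}^{m \times n}$ at X = W. So,
$0 \in \partial f_y(W)$, or $\mathcal{A}^* y \in \partial \|W\|_*$. Using the fact (see, e.g., [38])

$\partial \|W\|_* = \{U_{m \times s} V_{n \times s}^T + M : W \text{ and } M \text{ have orthogonal row and column spaces, and } \|M\| \le 1\},$

it follows that there exist matrices $U_{m \times (r-s)}$, $V_{n \times (r-s)}$ such that $\mathcal{A}^* y = U \operatorname{Diag}(\sigma(\mathcal{A}^* y)) V^T$,
where $U = [U_{m \times s}\ U_{m \times (r-s)}]$, $V = [V_{n \times s}\ V_{n \times (r-s)}]$ are orthogonal matrices and

$\sigma_i(\mathcal{A}^* y) \begin{cases} = 1, & \text{if } i \in J, \\ \in [0, 1], & \text{if } i \in \bar{J}, \end{cases}$

where $J := \{i : \sigma_i(W) \ne 0\}$ and $\bar{J} := \{1, 2, \ldots, r\} \setminus J$. Therefore, the optimal objective value of
the optimization problem

(17) $\min_{y, \gamma} \left\{\gamma : \mathcal{A}^* y \in \partial \|W\|_*,\ \sigma_i(\mathcal{A}^* y) \begin{cases} = 1, & \text{if } i \in J, \\ \in [0, \gamma], & \text{if } i \in \bar{J} \end{cases}\right\}$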
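is at most one. For the given W with its SVD $W = U_{m \times s} W_s V_{n \times s}^T$, let

$\Pi := \operatorname{conv}\left\{M \in \mathbb{R}^{m \times n} : \text{the SVD of } M \text{ is } M = [U_{m \times s}\ U_{m \times (r-s)}] \begin{pmatrix} 0_s & 0 \\ 0 & \sigma(M) \end{pmatrix} [V_{n \times s}\ V_{n \times (r-s)}]^T\right\}.$

It is easy to see that $\Pi$ is a subspace and its normal cone (in the sense of variational analysis,
see, e.g., [36] for details) is specified by $\Pi^\perp$. Thus, the above problem (17) is equivalent to the
following convex optimization problem with set constraint

(18) $\min_{y, M} \{\|M\| : \mathcal{A}^* y - U_{m \times s} V_{n \times s}^T - M = 0,\ M \in \Pi\}.$

We will show that the optimal value is less than 1. For a contradiction, suppose that the optimal
value is one. Then, by Theorem 10.1 and Exercise 10.52 in [36], there exists a Lagrange multiplier
$D \in \mathbb{R}^{m \times n}$ such that the function

$L(y, M) = \|M\| + \langle D, \mathcal{A}^* y - U_{m \times s} V_{n \times s}^T - M \rangle + \delta_\Pi(M)$

has unconstrained minimum in y, M equal to 1, where $\delta_\Pi(\cdot)$ is the indicator function of $\Pi$. Let
$y^*, M^*$ be an optimal solution. Then, by the optimality condition $0 \in \partial L$, we obtain that

$0 \in \partial_y L(y^*, M^*)$, and $0 \in \partial_M L(y^*, M^*)$.

Direct calculation yields that

$\mathcal{A}D = 0$, and $0 \in -D + \partial \|M^*\| + \Pi^\perp$.

Notice that Corollary 6.4 in [22] implies that for every $C \in \partial \|M^*\|$, $C \in \Pi$ and $\|C\|_* \le 1$. Then
there exist $D_J \in \Pi^\perp$ and $D_{\bar{J}} \in \partial \|M^*\| \subset \Pi$ such that $D = D_J + D_{\bar{J}}$ with $\|D_{\bar{J}}\|_* \le 1$. Therefore,
$\langle D, U_{m \times s} V_{n \times s}^T \rangle = \langle D_J, U_{m \times s} V_{n \times s}^T \rangle$ and $\langle D, M^* \rangle = \langle D_{\bar{J}}, M^* \rangle$. Moreover,
$\langle D_{\bar{J}}, M^* \rangle \le \|M^*\|$ by the definition of the dual norm of $\|\cdot\|$. This together with the facts
$\mathcal{A}D = 0$, $D_J \in \Pi^\perp$ and $D_{\bar{J}} \in \partial \|M^*\| \subset \Pi$ yields

$L(y^*, M^*) = \|M^*\| - \langle D_{\bar{J}}, M^* \rangle + \langle D, \mathcal{A}^* y^* \rangle - \langle D_J, U_{m \times s} V_{n \times s}^T \rangle + \delta_\Pi(M^*)$
$\ge -\langle D_J, U_{m \times s} V_{n \times s}^T \rangle + \delta_\Pi(M^*).$

Thus, the minimum value of L(y, M) is attained, $L(y^*, M^*) = -\langle D_J, U_{m \times s} V_{n \times s}^T \rangle$, when
$M^* \in \Pi$ and $\langle D_{\bar{J}}, M^* \rangle = \|M^*\|$. We obtain that $\|D_{\bar{J}}\|_* = 1$. By assumption,
$1 = L(y^*, M^*) = -\langle D_J, U_{m \times s} V_{n \times s}^T \rangle$. That is, $\sum_{i=1}^{s} (U_{m \times s}^T D V_{n \times s})_{ii} = -1$. Without loss of
generality, let the SVD of the optimal $M^*$ be

$M^* = U \begin{pmatrix} 0_s & 0 \\ 0 & \sigma(M^*) \end{pmatrix} V^T,$ where $U := [U_{m \times s}\ U_{m \times (r-s)}]$ and $V := [V_{n \times s}\ V_{n \times (r-s)}]$.

From the above arguments, we obtain that

i) $\mathcal{A}D = 0$,

ii) $\sum_{i=1}^{s} (U_{m \times s}^T D V_{n \times s})_{ii} = \sum_{i \in J} (U^T D V)_{ii} = -1$,

iii) $\sum_{i \in \bar{J}} (U^T D V)_{ii} = 1$.

Clearly, for every $t \in \mathbb{R}$, the matrices $X_t := W + tD$ are feasible in (4). Note that

$W = U_{m \times s} W_s V_{n \times s}^T = [U_{m \times s}\ U_{m \times (r-s)}] \begin{pmatrix} W_s & 0 \\ 0 & 0 \end{pmatrix} [V_{n \times s}\ V_{n \times (r-s)}]^T.$

Then, $\|W\|_* = \|U^T W V\|_* = \operatorname{Tr}(U^T W V)$. From the above equations, we obtain that
$\|X_t\|_* = \|W\|_*$ for all small enough $t > 0$ (since $\sigma_i(W) > 0$, $i \in \{1, 2, \ldots, s\}$). Noting that W is the
unique optimal solution to (4), we have $X_t = W$, which means that $(U^T D V)_{ii} = 0$ for $i \in J$.
This is a contradiction, and hence the desired conclusion holds.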

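We next prove that $\mathcal{A}$ is s-good if $\gamma_s(\mathcal{A}) < 1$. That is, we let W be an s-rank matrix and
we show that W is the unique optimal solution to (4). Without loss of generality, let W be
a matrix of rank $s' \ne 0$ and $U_{m \times s'} W_{s'} V_{n \times s'}^T$ be its SVD, where $U_{m \times s'} \in \mathbb{R}^{m \times s'}$, $V_{n \times s'} \in \mathbb{R}^{n \times s'}$
are orthogonal matrices and $W_{s'} = \operatorname{Diag}((\sigma_1(W), \ldots, \sigma_{s'}(W))^T)$. It follows from Proposition 2.5
that $\gamma_{s'}(\mathcal{A}) \le \gamma_s(\mathcal{A}) < 1$. By the definition of $\gamma_s(\mathcal{A})$, there exists $y \in \mathbb{R}^p$ such that
$\mathcal{A}^* y = U \operatorname{Diag}(\sigma(\mathcal{A}^* y)) V^T$, where $U = [U_{m \times s'}\ U_{m \times (r-s')}]$, $V = [V_{n \times s'}\ V_{n \times (r-s')}]$ and

$\sigma_i(\mathcal{A}^* y) \begin{cases} = 1, & \text{if } \sigma_i(W) \ne 0, \\ \in [0, 1), & \text{if } \sigma_i(W) = 0. \end{cases}$

The function

$f(X) = \|X\|_* - y^T [\mathcal{A}X - \mathcal{A}W] = \|X\|_* - \langle \mathcal{A}^* y, X \rangle + \|W\|_*$

becomes the objective function of (4) on the feasible set of (4). Note that $\langle \mathcal{A}^* y, X \rangle \le \|X\|_*$ by
$\|\mathcal{A}^* y\| \le 1$ and the definition of dual norm. So, $f(X) \ge \|X\|_* - \|X\|_* + \|W\|_* = \|W\|_*$ and
this function attains its unconstrained minimum in X at X = W. Hence X = W is an optimal
solution to (4). It remains to show that this optimal solution is unique. Let Z be another
optimal solution to the problem. Then $f(Z) - f(W) = \|Z\|_* - y^T \mathcal{A}Z = \|Z\|_* - \langle \mathcal{A}^* y, Z \rangle = 0$.
This together with the fact $\|\mathcal{A}^* y\| \le 1$ implies that there exist SVDs for $\mathcal{A}^* y$ and Z such that

$\mathcal{A}^* y = \tilde{U} \operatorname{Diag}(\sigma(\mathcal{A}^* y)) \tilde{V}^T, \quad Z = \tilde{U} \operatorname{Diag}(\sigma(Z)) \tilde{V}^T,$

where $\tilde{U} \in \mathbb{R}^{m \times r}$ and $\tilde{V} \in \mathbb{R}^{n \times r}$ are orthogonal matrices, and $\sigma_i(Z) = 0$ if $\sigma_i(\mathcal{A}^* y) \ne 1$. Thus,
for $\sigma_i(\mathcal{A}^* y) = 0$, $\forall i \in \{s' + 1, \ldots, r\}$, we must have $\sigma_i(Z) = \sigma_i(W) = 0$. By the two forms
of SVDs of $\mathcal{A}^* y$ as above, $U_{m \times s'} V_{n \times s'}^T = \tilde{U}_{m \times s'} \tilde{V}_{n \times s'}^T$, where $\tilde{U}_{m \times s'}$, $\tilde{V}_{n \times s'}$ are the corresponding
submatrices of $\tilde{U}$, $\tilde{V}$, respectively. Without loss of generality, let

$U = [u_1, u_2, \ldots, u_r]$, $V = [v_1, v_2, \ldots, v_r]$ and $\tilde{U} = [\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_r]$, $\tilde{V} = [\tilde{v}_1, \tilde{v}_2, \ldots, \tilde{v}_r]$,

where $u_j = \tilde{u}_j$ and $v_j = \tilde{v}_j$ for the corresponding index $j \in \{i : \sigma_i(\mathcal{A}^* y) = 0,\ i \in \{s' + 1, \ldots, r\}\}$.
Then we have

$Z = \sum_{i=1}^{s'} \sigma_i(Z) \tilde{u}_i \tilde{v}_i^T, \quad W = \sum_{i=1}^{s'} \sigma_i(W) u_i v_i^T.$

From $U_{m \times s'} V_{n \times s'}^T = \tilde{U}_{m \times s'} \tilde{V}_{n \times s'}^T$, we obtain that

$\sum_{i=s'+1}^{r} \sigma_i(\mathcal{A}^* y) \tilde{u}_i \tilde{v}_i^T = \sum_{i=s'+1}^{r} \sigma_i(\mathcal{A}^* y) u_i v_i^T.$

Therefore, we deduce

$\sum_{i=s'+1,\ \sigma_i(\mathcal{A}^* y) \ne 0}^{r} \sigma_i(\mathcal{A}^* y) \tilde{u}_i \tilde{v}_i^T + \sum_{i=s'+1,\ \sigma_i(\mathcal{A}^* y) = 0}^{r} \tilde{u}_i \tilde{v}_i^T$
$= \sum_{i=s'+1,\ \sigma_i(\mathcal{A}^* y) \ne 0}^{r} \sigma_i(\mathcal{A}^* y) u_i v_i^T + \sum_{i=s'+1,\ \sigma_i(\mathcal{A}^* y) = 0}^{r} u_i v_i^T =: \Omega.$

Clearly, the rank of $\Omega$ is no less than $r - s' \ge r - s$. From the orthogonality property of U, V
and $\tilde{U}$, $\tilde{V}$, we easily derive that

$\Omega^T \tilde{u}_i \tilde{v}_i^T = 0, \quad \Omega^T u_i v_i^T = 0$, for all $i \in \{1, 2, \ldots, s'\}$.

Thus, we obtain $\Omega^T (Z - W) = 0$, which implies that the rank of the matrix Z - W is no more
than s. Since $\gamma_s(\mathcal{A}) < 1$, there exists $\tilde{y}$ such that

$\sigma_i(\mathcal{A}^* \tilde{y}) \begin{cases} = 1, & \text{if } \sigma_i(Z - W) \ne 0, \\ \in [0, 1), & \text{if } \sigma_i(Z - W) = 0. \end{cases}$

Therefore, $0 = \tilde{y}^T \mathcal{A}(Z - W) = \langle \mathcal{A}^* \tilde{y}, Z - W \rangle = \|Z - W\|_*$. Then Z = W. □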
For the G-number $\hat{\gamma}_s(\mathcal{A})$, we directly obtain the following equivalent theorem of s-goodness
from Proposition 2.6 and Theorem 3.1.

Theorem 3.2. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, and $s \in \{1, 2, \ldots, r\}$. Then $\mathcal{A}$
is s-good if and only if $\hat{\gamma}_s(\mathcal{A}) < 1/2$.

For $X \in \mathbb{R}^{m \times n}$, we define the sum of the s largest singular values of X as

$\|X\|_{s,*} := \max_{Z \in P_s} \langle Z, X \rangle.$

We immediately obtain the following result utilizing Proposition 2.6 and Theorem 3.2.

Corollary 3.3. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, and $s \in \{1, 2, \ldots, r\}$. Then
$\hat{\gamma}_s(\mathcal{A})$ is the best upper bound on the norm $\|X\|_{s,*}$ of matrices $X \in \operatorname{Null}(\mathcal{A})$ such that $\|X\|_* \le 1$.

As a result, the linear transformation $\mathcal{A}$ is s-good if and only if the maximum of $\|\cdot\|_{s,*}$-norms
of matrices $X \in \operatorname{Null}(\mathcal{A})$ with $\|X\|_* = 1$ is less than 1/2.
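
Corollary 3.3 can be evaluated by hand when the null space is one-dimensional. The sketch below rests on an assumption made only for illustration: a hypothetical transformation whose null space is spanned by a single diagonal matrix N = Diag(d), so that the singular values of N are the $|d_i|$. In that case $\hat{\gamma}_s(\mathcal{A})$ is the ratio $\|N\|_{s,*} / \|N\|_*$, and the largest s with ratio below 1/2 is $s_*(\mathcal{A})$. The function names are not from the paper.

```python
def ky_fan_ratio(diagonal, s):
    """Return ||N||_{s,*} / ||N||_* for N = Diag(diagonal), with N nonzero."""
    sigma = sorted((abs(d) for d in diagonal), reverse=True)  # singular values of Diag(d)
    total = sum(sigma)                                        # nuclear norm of N
    if total == 0:
        raise ValueError("the null-space direction must be nonzero")
    return sum(sigma[:s]) / total                             # sum of the s largest over the total

def largest_good_s(diagonal):
    """Largest s with ratio < 1/2 under the one-dimensional null-space assumption."""
    best = 0
    for s in range(1, len(diagonal) + 1):
        if ky_fan_ratio(diagonal, s) < 0.5:
            best = s
    return best

print(largest_good_s([1, 1, 1, 1, 1]))  # mass spread evenly over five singular values
print(largest_good_s([1, 1]))           # ratio for s = 1 is exactly 1/2, so the strict test fails
print(largest_good_s([4, 1, 1, 1, 1]))  # one dominant singular value
```

The second case matches a direct check: with N the 2 x 2 identity, W = Diag(1, 0) and W - tN = Diag(1 - t, -t) have the same image under the transformation and the same nuclear norm for every t in [0, 1], so W is not the unique minimizer of (4).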

4. Exact and stable recovery via G-number 

In the previous sections, we showed that G-numbers $\gamma_s(\mathcal{A})$ and $\hat{\gamma}_s(\mathcal{A})$ are responsible for
s-goodness of a linear transformation $\mathcal{A}$. Observe that the definition of s-goodness of a linear
transformation $\mathcal{A}$ indicates that whenever the observation b in the following

(19) $\hat{W} \in \operatorname{argmin}_X \{\|X\|_* : \|\mathcal{A}X - b\| \le \varepsilon\}$

is exact (noiseless) and comes from an s-rank matrix W such that $b = \mathcal{A}W$, W is the unique
optimal solution of the above optimization problem (19) where $\varepsilon$ is set to 0. This establishes a
sufficient condition for the precise LMR of an s-rank matrix W in the "ideal case" when there
is no measurement error or noise and this optimization problem is solved exactly.

Theorem 4.1. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, and $s \in \{1, 2, \ldots, r\}$. Let W
be an s-rank matrix such that $\mathcal{A}W = b$. If $\mathcal{A}$ is s-good ($\hat{\gamma}_s(\mathcal{A}) < 1/2$, or $\gamma_s(\mathcal{A}) < 1$), then W
is the unique solution to LMR (1), i.e., the solution to LMR (1) can be exactly recovered from
Problem (4).

Proof. By the definition of s-goodness of a linear transformation $\mathcal{A}$, the assumptions that
$\mathcal{A}W = b$ and $\operatorname{rank}(W) \le s$ imply that W is the unique solution to problem (4). It remains to
show that W is the unique solution to problem (1). For a contradiction, suppose there is
another solution Y to problem (1). Then $\mathcal{A}W = \mathcal{A}Y = b$. By the s-goodness of $\mathcal{A}$, the problem
$\min\{\|X\|_* : \mathcal{A}X = \mathcal{A}W\}$, which coincides with $\min\{\|X\|_* : \mathcal{A}X = \mathcal{A}Y\}$, has a unique solution,
hence Y = W and we reach a contradiction. □

It turns out that the same quantities $\gamma_s(\mathcal{A})$ ($\hat{\gamma}_s(\mathcal{A})$) can be used to measure the error of
low-rank matrix recovery in the case when the matrix $W \in \mathbb{R}^{m \times n}$ is not s-rank and the
nuclear norm minimization problem is not solved exactly. In what follows, let $W = U \operatorname{Diag}(\sigma(W)) V^T$,
where $\sigma(W) = (\sigma_1(W), \ldots, \sigma_r(W))^T$ and $\sigma_1(W) \ge \ldots \ge \sigma_r(W) \ge 0$ are the singular values of W in
nonincreasing order. Let $W^s := U \operatorname{Diag}((\sigma_1(W), \ldots, \sigma_s(W), 0, \ldots, 0)^T) V^T$. Clearly, in terms of
nuclear norm, $W^s$ stands for the best s-rank approximation of W. In order to establish the error
bound in the "non-ideal case", we also need the following assumption for a matrix $X \in \mathbb{R}^{m \times n}$:

Block Assumption: We say that X satisfies the block assumption with respect to W if there
exists $(U, V) \in \Xi(W)$ such that $U^T X V$ has the block form

$U^T X V = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix},$

where $X_1 \in \mathbb{R}^{s \times s}$ and $X_2 \in \mathbb{R}^{(r-s) \times (r-s)}$. In this case, we write $X^{(s)} := U \begin{pmatrix} X_1 & 0 \\ 0 & 0 \end{pmatrix} V^T$.
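Theorem 4.2. Let $\mathcal{A} : \mathbb{R}^{m \times n} \to \mathbb{R}^p$ be a linear transformation, $s \in \{0, 1, 2, \ldots, r\}$, and
$\hat{\gamma}_s(\mathcal{A}) < 1/2$ (or, equivalently, $\gamma_s(\mathcal{A}) < 1$). Also let W be a matrix such that $\mathcal{A}W = b$. Let X be a
$\nu$-optimal solution to the problem (4), meaning that

$\mathcal{A}X = \mathcal{A}W$ and $\|X\|_* \le \operatorname{Opt}(\mathcal{A}W) + \nu$,

where $\operatorname{Opt}(\mathcal{A}W)$ is the optimal value of (4). If the Block Assumption holds for X, then

$\|X - W\|_* \le \frac{2 \|W - W^s\|_* + \nu}{1 - 2 \hat{\gamma}_s(\mathcal{A})}.$

Proof. Set Z := X - W. Let $D_1 := \operatorname{Diag}((\sigma_1(W), \ldots, \sigma_s(W))^T)$ and
$D_2 := \operatorname{Diag}((\sigma_{s+1}(W), \ldots, \sigma_r(W))^T)$. Using the assumptions, we obtain that Z has the form

$Z = U \begin{pmatrix} X_1 - D_1 & 0 \\ 0 & X_2 - D_2 \end{pmatrix} V^T.$

Define

$Z^{(s)} := U \begin{pmatrix} X_1 - D_1 & 0 \\ 0 & 0 \end{pmatrix} V^T.$

It is easy to verify that $Z^{(s)} = X^{(s)} - W^s$ and $\|Z^{(s)}\|_* \le \|Z\|_{s,*}$. Along with the fact $\mathcal{A}Z = 0$
and Corollary 3.3, this yields

(20) $\|Z^{(s)}\|_* \le \|Z\|_{s,*} \le \hat{\gamma}_s(\mathcal{A}) \|Z\|_*.$

On the other hand, W is a feasible solution to (4), so $\operatorname{Opt}(\mathcal{A}W) \le \|W\|_*$. Thus, we have

$\|W\|_* + \nu \ge \|W + Z\|_* \ge \|W^s + Z - Z^{(s)}\|_* - \|Z^{(s)} + W - W^s\|_*$

(21) $= \|W^s\|_* + \|Z - Z^{(s)}\|_* - \|Z^{(s)}\|_* - \|W - W^s\|_*,$

where the last equation follows from the facts that $W^s (Z - Z^{(s)})^T = 0 = (W - W^s)(Z^{(s)})^T$ and
$(W^s)^T (Z - Z^{(s)}) = 0 = (W - W^s)^T Z^{(s)}$, and Lemma 2.3 in [33]. Since
$\|W\|_* = \|W^s\|_* + \|W - W^s\|_*$, this is equivalent to

$\|Z - Z^{(s)}\|_* \le \|Z^{(s)}\|_* + 2 \|W - W^s\|_* + \nu.$

Therefore, we obtain

$\|Z\|_* \le \|Z^{(s)}\|_* + \|Z - Z^{(s)}\|_* \le 2 \|Z^{(s)}\|_* + 2 \|W - W^s\|_* + \nu$

$\le 2 \hat{\gamma}_s(\mathcal{A}) \|Z\|_* + 2 \|W - W^s\|_* + \nu.$

Since $\hat{\gamma}_s(\mathcal{A}) < 1/2$, we reach the desired conclusion. □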
Notice that the above Block Assumption holds naturally in the SSR (CMP) context. In
general, we may have

$U^T X V = \begin{pmatrix} X_1 & X_3 \\ X_4 & X_2 \end{pmatrix},$

where either $X_3$ or $X_4$ is not zero. In this case, we have

$Z = U \begin{pmatrix} X_1 - D_1 & X_3 \\ X_4 & X_2 - D_2 \end{pmatrix} V^T.$


If we define 



we cannot conclude (21). If we define

we cannot conclude $\|Z^{(s)}\|_* \le \|Z\|_{s,*}$. It is not difficult to give counterexamples to illustrate the
above facts. Meanwhile, in the last two cases, the rank of $Z^{(s)}$ may be greater than s. Thus the
condition $\hat{\gamma}_s(\mathcal{A}) < 1/2$ is not sufficient, and hence we need stricter restrictions on the linear
transformation $\mathcal{A}$.

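Below, we consider approximate solutions $X$ to the problem

$$\mathrm{Opt}(b) = \min_{X \in \mathbb{R}^{m\times n}} \{\|X\|_* : \|\mathcal{A}X - b\| \le \varepsilon\}, \tag{22}$$

where $\varepsilon \ge 0$ and $b = \mathcal{A}W + \zeta$, $\zeta \in \mathbb{R}^p$ with $\|\zeta\| \le \varepsilon$. We will show that in the "non-ideal case",
when $W$ is "nearly $s$-rank" and (22) is solved to near-optimality, the error of the LMR via NNM
can be measured by $\hat\gamma_s(\mathcal{A}, \beta)$ with a finite $\beta$.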

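Theorem 4.3. Let $\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^p$ be a linear transformation, $s \in \{1,2,\ldots,r\}$, and let
$\beta \in [0,+\infty]$ be such that $\hat\gamma := \hat\gamma_s(\mathcal{A},\beta) < 1/2$ (or $\gamma := \gamma_s(\mathcal{A}, \beta/(1-\hat\gamma)) < 1$). Let $\varepsilon \ge 0$ and let $W$
and $b$ in (22) be such that $\|\mathcal{A}W - b\| \le \varepsilon$, and let $W^s$ be defined as in the beginning of this section.
Let $X$ be a $(\vartheta,\nu)$-optimal solution to the problem (22), meaning that

$$\|\mathcal{A}X - b\| \le \vartheta \quad\text{and}\quad \|X\|_* \le \mathrm{Opt}(b) + \nu.$$

If the Block Assumption holds for $X$, then

$$\|X - W\|_* \le \frac{2\beta(\vartheta+\varepsilon) + 2\|W - W^s\|_* + \nu}{1 - 2\hat\gamma} = \frac{1+\gamma}{1-\gamma}\,\bigl[2\beta(\vartheta+\varepsilon) + 2\|W - W^s\|_* + \nu\bigr]. \tag{23}$$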



Proof. Note that $W$ is a feasible solution to (22). Let $Z = X - W$. As in the proof of
Theorem 4.2, we obtain that $\|Z^{(s)}\|_* \le \|Z\|_{s,*}$ and

$$\|Z\|_* \le 2\|Z^{(s)}\|_* + 2\|W - W^s\|_* + \nu.$$

Employing (14) in Theorem 2.7, we derive

$$\|Z\|_{s,*} \le \beta\|\mathcal{A}Z\| + \hat\gamma\|Z\|_* \le \beta(\vartheta + \varepsilon) + \hat\gamma\|Z\|_*, \tag{24}$$

where the last inequality holds by $\|\mathcal{A}Z\| = \|\mathcal{A}X - b + b - \mathcal{A}W\| \le \|\mathcal{A}X - b\| + \|b - \mathcal{A}W\|$.
Combining with the above inequalities, we obtain

$$\|Z\|_* \le 2\beta(\vartheta + \varepsilon) + 2\hat\gamma\|Z\|_* + 2\|W - W^s\|_* + \nu.$$

Now, the desired conclusion follows from the assumption $\hat\gamma < 1/2$ and $\hat\gamma = \gamma/(1 + \gamma)$. □
Theorem 4.3 shows that under the Block Assumption the error of imperfect
low-rank matrix recovery can be bounded, as in (23), in terms of $\hat\gamma_s(\mathcal{A}, \beta)$, $\beta$, the measurement error $\varepsilon$, the "$s$-tail"
$\|W - W^s\|_*$ and the accuracy $(\vartheta,\nu)$ to which the estimate solves the program (22). Note that we
need $\gamma_s(\mathcal{A}, \beta) < 1$ (or $\hat\gamma_s(\mathcal{A}, \beta) < 1/2$). However, the "true" necessary and sufficient condition
for $s$-goodness is $\gamma_s(\mathcal{A}) < 1$ (or $\hat\gamma_s(\mathcal{A}) < 1/2$). Also, note that $\hat\gamma_s(\mathcal{A}, \beta) = \hat\gamma_s(\mathcal{A})$ for all finite
"large enough" values of $\beta$, see Proposition 2.3 for details.
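To make the roles of the quantities in the error bound of Theorem 4.3 concrete, here is a small sketch that evaluates the right-hand side for given inputs and checks that the two equivalent forms (in terms of the hatted constant and in terms of the unhatted one, related by hat-gamma = gamma/(1+gamma)) agree. The function name `recovery_error_bound` and all numerical inputs are hypothetical; in practice the G-number is not known exactly and would be replaced by a computable upper bound.

```python
def recovery_error_bound(gamma_hat, beta, theta, eps, tail, nu):
    """Right-hand side of the bound in Theorem 4.3.

    gamma_hat : value (or upper bound) of the hatted G-number, must be < 1/2
    beta      : the finite beta used in the G-number
    theta, nu : accuracy of the approximate solution
    eps       : measurement error level
    tail      : nuclear norm of W - W^s (the "s-tail")
    """
    if not 0.0 <= gamma_hat < 0.5:
        raise ValueError("the bound requires 0 <= gamma_hat < 1/2")
    return (2.0 * beta * (theta + eps) + 2.0 * tail + nu) / (1.0 - 2.0 * gamma_hat)

# Hypothetical inputs, chosen only for illustration.
gamma_hat = 0.25
gamma = gamma_hat / (1.0 - gamma_hat)          # equivalently gamma_hat = gamma / (1 + gamma)
bound_hat = recovery_error_bound(gamma_hat, beta=2.0, theta=0.01, eps=0.01, tail=0.05, nu=0.0)
bound_gamma = (1.0 + gamma) / (1.0 - gamma) * (2.0 * 2.0 * (0.01 + 0.01) + 2.0 * 0.05 + 0.0)
print(bound_hat, bound_gamma)
```

The bound degrades as the constant approaches 1/2, which is why sharp computable upper bounds on the G-number matter for the verifiable conditions discussed next.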



5. Computing bounds on the G-number via convex optimization 

We showed that the G-number $\hat\gamma_s(\mathcal{A}, \beta)$ controls some of the fundamental properties of a linear
transformation $\mathcal{A}$ relative to LMR. Since it seems difficult to evaluate this quantity exactly,
we provide ways of computing upper and lower bounds on $\hat\gamma_s(\mathcal{A}, \beta)$ via
convex optimization techniques.

5.1. Computing lower bounds on $\hat\gamma_s(\mathcal{A},\beta)$. Note that $\hat\gamma_s(\mathcal{A},\beta) \ge \hat\gamma_s(\mathcal{A})$ for any $\beta \ge 0$ by
Proposition 2.4. Therefore, we may establish a lower bound for the G-numbers $\hat\gamma_s(\mathcal{A}, \beta)$ by giving
such a bound for $\hat\gamma_s(\mathcal{A})$. We can bound $\hat\gamma_s(\mathcal{A})$ from below utilizing Theorem 2.7. Recall von
Neumann's trace inequality [30]: $\langle Y, Z\rangle \le \langle \sigma(Y), \sigma(Z)\rangle$ for every pair of matrices $Y, Z \in \mathbb{R}^{m\times n}$,
where the equality holds when $Y$, $Z$ share the same orthogonal row and column spaces. In what
follows, we define

$$E(\mathcal{A}) := \{(U,V) : U \in \mathbb{R}^{m\times r},\ V \in \mathbb{R}^{n\times r},\ \exists\, W = U\,\mathrm{Diag}(\sigma(W))\,V^T,\ \mathcal{A}W = 0\}.$$

From the representation (15), we obtain

$$\hat\gamma_s(\mathcal{A}) = \max_{\Xi \in P_s} f(\Xi), \qquad f(\Xi) = \max_X \{\langle \Xi, X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0\}.$$

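It is easy to see that $f(\Xi)$ is convex. If we solve the convex optimization problem

$$X_\Xi \in \operatorname*{argmax}_X \{\langle \Xi, X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0\}, \tag{25}$$

we obtain a linear form $\langle X_\Xi, \Theta\rangle$ of $\Theta \in P_s$ which underestimates $f(\Theta)$ everywhere and agrees
with $f(\Theta)$ when $\Theta = \Xi$. Notice that

$$\max_X \{\langle \Xi, X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0\}$$

$$\ge \max_{X,\,(U,V)\in E(\mathcal{A})} \{\langle \Xi, X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0,\ \Xi = U\,\mathrm{Diag}(t)\,V^T,\ X = U\,\mathrm{Diag}(x_t)\,V^T\}.$$

Since we only need a lower bound from problem (25), we may
set $\Xi = U\,\mathrm{Diag}(t)\,V^T$ by choosing $(U,V) \in E(\mathcal{A})$ and $t \in \mathbb{R}^r$ with $\|t\|_1 \le s$, $\|t\|_\infty \le 1$. Thus, we
may obtain a lower bound from the following optimization problem:

$$\max_{x_t} \{\langle t, x_t\rangle : \|x_t\|_1 \le 1,\ \mathcal{A}[U\,\mathrm{Diag}(x_t)\,V^T] = 0\}.$$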

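For simplicity, we define $\mathcal{A}$ by a set of $p$ matrices $A_i \in \mathbb{R}^{m\times n}$, $i \in \{1,2,\ldots,p\}$:

$$\mathcal{A}(\cdot) = (\langle A_1, \cdot\rangle, \langle A_2, \cdot\rangle, \ldots, \langle A_p, \cdot\rangle)^T.$$

Thus, we may rewrite

$$\mathcal{A}X_\Xi = A x_t, \tag{26}$$

where $A \in \mathbb{R}^{p\times r}$ with $A_{ij} = (U^T A_i V)_{jj}$. In this sense, we may formulate the convex optimization
problem (25) as the following group of LP problems

$$x_t \in \operatorname*{argmax}_x \{\langle t, x\rangle : \|x\|_1 \le 1,\ Ax = 0\}. \tag{27}$$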
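The reduction from the matrix problem to the vector LP rests on the identity that, for X = U Diag(x) V^T, the i-th measurement is the inner product of x with the diagonal of U^T A_i V. The following sketch checks this identity numerically. The random measurement matrices, the frames `U`, `V` and the helper names `apply_map` and `reduced_matrix` are assumptions made for illustration only.

```python
import numpy as np

def apply_map(A_mats, X):
    """A(X) = (<A_1, X>, ..., <A_p, X>)^T with the trace inner product."""
    return np.tensordot(A_mats, X, axes=([1, 2], [0, 1]))

def reduced_matrix(A_mats, U, V):
    """p x r matrix with entry (i, j) equal to (U^T A_i V)[j, j] = u_j^T A_i v_j."""
    return np.einsum("mj,imn,nj->ij", U, A_mats, V)

rng = np.random.default_rng(1)
m, n, p = 5, 4, 3
r = min(m, n)
A_mats = rng.standard_normal((p, m, n))           # hypothetical A_1, ..., A_p

# Hypothetical frames with orthonormal columns and a coefficient vector x.
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
x = rng.standard_normal(r)

X = U @ np.diag(x) @ V.T
lhs = apply_map(A_mats, X)                        # A(U Diag(x) V^T)
rhs = reduced_matrix(A_mats, U, V) @ x            # reduced matrix times x
print(np.max(np.abs(lhs - rhs)))
```

Once the reduced matrix is available for a fixed pair (U, V), the remaining problem is an ordinary linear program in x and can be handed to any LP solver.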

The optimal solutions may not be unique because, for a given $\Xi$, the orthogonal matrices $U \in
\mathbb{R}^{m\times r}$, $V \in \mathbb{R}^{n\times r}$ are usually not unique. In order to establish a lower bound for $\hat\gamma_s(\mathcal{A})$, we
may choose one pair $(U, V) \in E(\mathcal{A})$ and then solve the corresponding LP (27). We obtain a
linear form $v^T x_t$ of $v \in \Delta_s$, where

$$\Delta_s := \{x \in \mathbb{R}^r : \|x\|_1 \le s,\ \|x\|_\infty \le 1\}.$$

Therefore, we obtain a lower bound result on $\hat\gamma_s(\mathcal{A})$ as follows:

Proposition 5.1. Let $A$ be specified as above and $x_t$ given by (27). Then, $\max_{v\in\Delta_s} v^T x_t$ is a
lower bound on $\hat\gamma_s(\mathcal{A})$.

Clearly, the above bound is easily computable. As in [18], we can use the standard sequential
convex approximation scheme for maximizing the convex function $f(\cdot)$ over $P_s$. In particular,
we can run the iterative process

$$t_{k+1} \in \operatorname*{argmax}_{v\in\Delta_s} v^T x_{t_k}, \qquad t_1 \in \Delta_s, \qquad U\,\mathrm{Diag}(t_1)\,V^T \in P_s.$$

This leads to a monotone nondecreasing sequence of lower bounds $t_k^T x_{t_k}$ on $\hat\gamma_s(\mathcal{A})$. We may
choose to terminate this iterative process when the improvement in the bounds falls below a
given tolerance, and we can start several runs from randomly chosen points $t_1$ and orthogonal
matrices $(U,V) \in E(\mathcal{A})$.
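The linear maximization step of the iterative process has a closed-form solution: over the set of vectors v with 1-norm at most s and infinity-norm at most 1 (s an integer), the maximum of v^T x is the sum of the s largest magnitudes of x, attained by placing the signs of x on those s positions. The sketch below illustrates only this step. The LP solve that produces x from t is not shown, and the vector `x_t` is a made-up example rather than an actual LP solution.

```python
import numpy as np

def ky_fan_vector_norm(x, s):
    """Sum of the s largest magnitudes of the entries of x."""
    return np.sort(np.abs(x))[::-1][:s].sum()

def argmax_over_box_l1(x, s):
    """One maximizer of v^T x over {v : ||v||_1 <= s, ||v||_inf <= 1}, s integer."""
    v = np.zeros_like(x, dtype=float)
    top = np.argsort(-np.abs(x))[:s]   # positions of the s largest magnitudes
    v[top] = np.sign(x[top])
    return v

# Hypothetical LP output x_t (illustration only).
x_t = np.array([0.5, -0.3, 0.1, 0.0, -0.1])
t_next = argmax_over_box_l1(x_t, s=2)
value = float(t_next @ x_t)
print(t_next, value)
```

In a full implementation, `t_next` would be fed back into the LP to obtain the next x, and the loop would stop once `value` no longer improves beyond a chosen tolerance.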
5.2. Computing upper bounds on $\hat\gamma_s(\mathcal{A},\beta)$. For an arbitrary linear transformation $\mathcal{B}$, we
have

$$\max_{\Xi,X} \{\langle \Xi, X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0,\ \Xi \in P_s\}$$

$$= \max_{\Xi,X} \{\langle \Xi, X - \mathcal{B}^*\mathcal{A}X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0,\ \Xi \in P_s\}. \tag{28}$$

In the same way as in (26), we define $\mathcal{B}$ by a set of $p$ matrices $B_k \in \mathbb{R}^{m\times n}$, $k \in \{1,2,\ldots,p\}$, and
$\mathcal{B}^*$ as

$$\mathcal{B}^* u = \sum_{k=1}^{p} u_k B_k, \qquad u = (u_1, u_2, \ldots, u_p)^T \in \mathbb{R}^p.$$

For simplicity, suppose (26) holds. Using a similar analysis, we choose all $B_k$ (simultaneously
diagonalizable) such that they have the singular value decompositions $B_k = U\,\mathrm{Diag}(y_k)\,V^T$ ($y_k \in
\mathbb{R}^r$) and then rewrite (28) as

$$\max_{\Xi,X} \{\langle \Xi, X - \mathcal{B}^*\mathcal{A}X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0,\ \Xi \in P_s\}$$

$$= \max_{t,x,U,V} \{\langle t, x - B^T A x\rangle : \|x\|_1 \le 1,\ Ax = 0,\ t \in \Delta_s\}, \tag{29}$$

where $B^T := [y_1, y_2, \ldots, y_p]$. If we fix $U$, $V$, the above problem is easy to solve as it was done in
[15]. In this case,

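$$\max_{t,x} \{\langle t, x - B^T A x\rangle : \|x\|_1 \le 1,\ Ax = 0,\ t \in \Delta_s\}$$

$$\le \max_{t,x} \{\langle t, x - B^T A x\rangle : \|x\|_1 \le 1,\ t \in \Delta_s\}$$

$$= \max_{t,\ i\in\{1,\ldots,r\}} \{\langle t, (I - B^T A)e_i\rangle : t \in \Delta_s\}$$

$$= \max_{i\in\{1,\ldots,r\}} \max_{t\in\Delta_s} \{\langle t, (I - B^T A)e_i\rangle\} = \max_{i\in\{1,\ldots,r\}} \|(I - B^T A)e_i\|_{s,1}, \tag{30}$$

where $\|x\|_{s,1}$ is the sum of the $s$ largest magnitudes of entries in $x$. Therefore, we have for all
$B \in \mathbb{R}^{p\times r}$

$$\hat\gamma_s(\mathcal{A}) = \max_{\Xi,X} \{\langle \Xi, X\rangle : \|X\|_* \le 1,\ \mathcal{A}X = 0,\ \Xi \in P_s\}$$

$$\le \max_{U,V,\ i\in\{1,\ldots,r\}} \|(I - B^T A)e_i\|_{s,1} =: f_{A,s}(B).$$

Taking $\Gamma_s(\mathcal{A}, +\infty) := \min_B f_{A,s}(B)$, we obtain

$$\hat\gamma_s(\mathcal{A}) \le \Gamma_s(\mathcal{A}, +\infty).$$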
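For a fixed pair (U, V), the quantity being minimized over B is simply the largest s-largest-magnitudes norm among the columns of I - B^T A, which is cheap to evaluate. The sketch below evaluates it for two trial matrices B. The random reduced matrix `A` and the function name `upper_bound_objective` are hypothetical, and no minimization over B (nor over U, V) is attempted here.

```python
import numpy as np

def ky_fan_vector_norm(x, s):
    """Sum of the s largest magnitudes of the entries of x."""
    return np.sort(np.abs(x))[::-1][:s].sum()

def upper_bound_objective(A, B, s):
    """max_i ||(I - B^T A) e_i||_{s,1} for A, B in R^{p x r}, with U, V held fixed."""
    r = A.shape[1]
    M = np.eye(r) - B.T @ A
    return max(ky_fan_vector_norm(M[:, i], s) for i in range(r))

rng = np.random.default_rng(2)
p, r, s = 6, 4, 2
A = rng.standard_normal((p, r))                    # hypothetical reduced matrix

# Trial 1: B = 0 leaves the identity, whose columns have norm exactly 1.
f_zero = upper_bound_objective(A, np.zeros((p, r)), s)

# Trial 2: if A has full column rank, B = pinv(A)^T gives B^T A = I, so the objective vanishes.
B_pinv = np.linalg.pinv(A).T
f_pinv = upper_bound_objective(A, B_pinv, s)
print(f_zero, f_pinv)
```

The second trial corresponds to the degenerate situation in which the reduced null space is trivial. In the interesting regime p < r no B achieves zero, and a convex solver would be used to minimize the objective.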

Observe that $f_{A,s}(B)$ is an easy-to-compute convex function of $B$ for fixed $U$, $V$ and it is indeed
related to semi-infinite programming [3]. Therefore, one may choose to utilize computational
semi-infinite programming techniques to compute the quantity $\Gamma_s(\mathcal{A}, +\infty)$.
The above analysis motivates the following useful function of $\mathcal{A}$ and $\beta$.

Definition 5.2. Let $\mathcal{A}$ and the corresponding matrices $A_i$, $i \in \{1,2,\ldots,p\}$, be given as above.
Let $\beta \in [0, +\infty]$. We define $\Gamma_s(\mathcal{A},\beta)$ as follows:

$$\Gamma_s(\mathcal{A},\beta) := \min_B \Bigl\{ \max_{U,V,\ i\in\{1,\ldots,r\}} \|(I - B^T A)e_i\|_{s,1} : \|(B)_{\cdot j}\|_d \le \beta,\ 1 \le j \le r \Bigr\}, \tag{31}$$

where $A$ is the matrix defined by $A_i$ and $U$, $V$ (as above), and $(B)_{\cdot j}$ is the $j$th column of $B$. If
there does not exist such a matrix $B$ as above, we take $\Gamma_s(\mathcal{A},\beta) = +\infty$. For convenience, we
abbreviate the notation $\Gamma_s(\mathcal{A}, +\infty)$ to $\Gamma_s(\mathcal{A})$.

By modifying the above process, we obtain that $\Gamma_s(\mathcal{A},\beta)$ provides an upper bound for the G-numbers
$\hat\gamma_s(\mathcal{A},\beta)$. Moreover, $\Gamma_s(\mathcal{A},\beta)$ shares some properties similar to those of the G-numbers
$\hat\gamma_s(\mathcal{A},\beta)$. In other words, $\Gamma_s(\mathcal{A},\beta)$ is nondecreasing in $s$, convex and nonincreasing in $\beta$, and is
such that $\Gamma_s(\mathcal{A},\beta) = \Gamma_s(\mathcal{A})$ for all large enough values of $\beta$. The following result shows that
$\Gamma_s(\mathcal{A},\beta)$ is an upper bound on $\hat\gamma_s(\mathcal{A}, \beta)$.

Theorem 5.3. For every $\mathcal{A}$ and $\beta \in [0, +\infty]$, we have $\Gamma_s(\mathcal{A},\beta) \ge \hat\gamma_s(\mathcal{A},\beta)$.

Proof. Let $W$ be an $s$-rank matrix with all nonzero singular values equal to 1 such that

$$W = U \begin{pmatrix} I_s & 0 \\ 0 & 0 \end{pmatrix} V^T,$$

where $I_s$ is the $s \times s$ identity matrix. For $U$, $V$, we get $\mathcal{A}W = A\sigma(W)$,
where $A$ is specified as in (26). Let $Y = [y_1, y_2, \ldots, y_r] \in \mathbb{R}^{p\times r}$ be such that $\|y_i\|_d \le \beta$ and
the columns in $I - Y^T A$ are of the $\|\cdot\|_{s,1}$-norm not exceeding $\Gamma_s(\mathcal{A},\beta)$. Define the linear
transformation $\mathcal{B}$ such that $\mathcal{B}W := Y\sigma(W)$. Setting $y = Y\sigma(W)$, the fact that $\|y_i\|_d \le \beta$,
$i \in \{1,2,\ldots,r\}$, implies that $\|y\|_d \le \beta\|\sigma(W)\|_1 \le \beta s$. Furthermore, noting that $\sigma(W)$ is an
$s$-sparse vector, we obtain

$$\|W - \mathcal{A}^* y\| = \|W - \mathcal{A}^*\mathcal{B}W\| = \|(I - B^T A)^T \sigma(W)\|_\infty \le \Gamma_s(\mathcal{A},\beta).$$

The desired conclusion follows immediately. □
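Note that $\|X\|_{st,*} \le s\|X\|_{t,*}$ for all positive integers $s$, $t$. Thus, we may replace $\Gamma_s(\mathcal{A},\beta)$ by
$s\Gamma_1(\mathcal{A},\beta)$, i.e.,

$$\hat\gamma_s(\mathcal{A},\beta) \le \Gamma_s(\mathcal{A},\beta) \le s\,\Gamma_1(\mathcal{A},\beta).$$

Moreover, we have $\Gamma_1(\mathcal{A},\beta) = \max_i \Gamma^i$, where

$$\Gamma^i := \min_{U,V,y_i} \{\|e_i - A^T y_i\|_\infty : \|y_i\|_d \le \beta\}, \qquad i \in \{1,2,\ldots,r\}. \tag{32}$$

By direct calculation, noting that the matrix $A$ is the representation of $\mathcal{A}$ with respect to $U$, $V$,
we obtain

$$\Gamma^i = \min_{U,V,y} \max_j \{|(e_i - A^T y)_j| : \|y\|_d \le \beta\}$$

$$= \min_{U,V,y} \max_x \{\langle e_i - A^T y, x\rangle : \|y\|_d \le \beta,\ \|x\|_1 \le 1\}$$

$$= \max_X \min_y \{\langle U\,\mathrm{Diag}(e_i)\,V^T, X\rangle - \langle \mathcal{A}^* y, X\rangle : \|y\|_d \le \beta,\ \|X\|_* \le 1,\ X = U\,\mathrm{Diag}(x)\,V^T\}$$

$$= \max_X \{\langle U\,\mathrm{Diag}(e_i)\,V^T, X\rangle - \beta\|\mathcal{A}X\| : \|X\|_* \le 1\}.$$

It follows from Theorem 2.7 that $\Gamma^i \le \hat\gamma_1(\mathcal{A}, \beta)$. Therefore, by Theorem 5.3 we have that the
relaxation for $\hat\gamma_1(\mathcal{A}, \beta)$ is exact, i.e.,

$$\Gamma_1(\mathcal{A},\beta) = \hat\gamma_1(\mathcal{A},\beta). \tag{33}$$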

As in Proposition 2.3, we present the following simple result which shows how large $\beta$ needs
to be to guarantee $\Gamma_s(\mathcal{A}, \beta) = \Gamma_s(\mathcal{A})$.

Proposition 5.4. Let $\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^p$ be a linear transformation, $\beta \in [0, +\infty]$ and $s \in
\{0, 1, 2, \ldots, r\}$. For some $\rho > 0$, let the image of the unit $\|\cdot\|_*$-ball in $\mathbb{R}^{m\times n}$ under the mapping
$X \mapsto \mathcal{A}X$ contain the ball $B = \{x \in \mathbb{R}^p : \|x\| \le \rho\}$. Then for every $s \le r$,

$$\beta \ge \frac{3}{2\rho} \ \text{ and } \ \Gamma_s(\mathcal{A}) < \frac{1}{2} \quad\Longrightarrow\quad \Gamma_s(\mathcal{A},\beta) = \Gamma_s(\mathcal{A}).$$

Proof. Fix $s \in \{1,2,\ldots,r\}$. Let $\Gamma_s(\mathcal{A}) < 1/2$. Then $\hat\gamma := \hat\gamma_s(\mathcal{A}) < 1/2$ and hence for every
matrix $W \in \mathbb{R}^{m\times n}$ with $s$ nonzero singular values, equal to 1, there exists a vector $y \in \mathbb{R}^p$ such
that

$$\|y\|_d \le \beta \quad\text{and}\quad \|\mathcal{A}^* y - W\| \le \hat\gamma.$$

By the triangle inequality, $\|\mathcal{A}^* y\| \le 1 + \hat\gamma < 3/2$. Following the same steps as in the proof of
Proposition 2.3, we reach the desired conclusion. □

6. s-goodness and RIP

We consider the connection between the restricted isometry property and $s$-goodness of the linear
transformation in LMR and present some explicit forms of restricted isometry (RI) constants
and $s$-goodness constants (G-numbers). Recall that the $s$-restricted isometry constant $\delta_s$ of a
linear transformation $\mathcal{A}$ is defined as the smallest constant such that the following holds for all
$s$-rank matrices $X \in \mathbb{R}^{m\times n}$:

$$(1 - \delta_s)\|X\|_F^2 \le \|\mathcal{A}X\|_2^2 \le (1 + \delta_s)\|X\|_F^2. \tag{34}$$

In this case, we say $\mathcal{A}$ possesses the RI($\delta_s$)-property (RIP) as in the CS context. For details, see
[10], [21], [27], [29] and the references therein.
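Computing the restricted isometry constant exactly is hard in general, but random sampling gives a quick sanity check: the largest observed deviation of the squared measurement norm divided by the squared Frobenius norm from 1, over rank-at-most-s samples, is a lower estimate of the constant (never an upper bound). The sketch below assumes a measurement map stored as a stack of p matrices. The Gaussian map, its scaling, and the function name `sampled_ri_lower_estimate` are illustrative assumptions.

```python
import numpy as np

def apply_map(A_mats, X):
    """A(X) = (<A_1, X>, ..., <A_p, X>)^T with the trace inner product."""
    return np.tensordot(A_mats, X, axes=([1, 2], [0, 1]))

def sampled_ri_lower_estimate(A_mats, s, trials=200, seed=0):
    """Largest | ||A X||_2^2 / ||X||_F^2 - 1 | over random samples X of rank <= s."""
    rng = np.random.default_rng(seed)
    _, m, n = A_mats.shape
    worst = 0.0
    for _ in range(trials):
        X = rng.standard_normal((m, s)) @ rng.standard_normal((s, n))
        ratio = np.sum(apply_map(A_mats, X) ** 2) / np.sum(X ** 2)
        worst = max(worst, abs(ratio - 1.0))
    return worst

m, n, s = 4, 3, 1

# Sanity case: A X = vec(X) is an isometry, so every ratio equals 1.
identity_map = np.eye(m * n).reshape(m * n, m, n)
delta_identity = sampled_ri_lower_estimate(identity_map, s)

# Hypothetical Gaussian map with p measurements, scaled so that E||A X||^2 = ||X||_F^2.
p = 40
gauss_map = np.random.default_rng(3).standard_normal((p, m, n)) / np.sqrt(p)
delta_gauss = sampled_ri_lower_estimate(gauss_map, s)
print(delta_identity, delta_gauss)
```

A small sampled value does not certify the RIP. It only rules out maps whose sampled deviation already exceeds the threshold required by the theorems that follow.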



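6.1. $\hat\gamma_s(\mathcal{A})$ and $\delta_{2s}$. We will show that the RI($\delta_{2s}$)-property of $\mathcal{A}$ implies that the G-numbers satisfy
$\hat\gamma_s(\mathcal{A}) < 1/2$ and $\gamma_s(\mathcal{A}) < 1$, which means that the RIP implies the sufficient conditions for
$s$-goodness.

Theorem 6.1. Let $\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^p$ be a linear transformation, and $s \in \{1,2,\ldots,r\}$. Assume
that $\mathcal{A}$ has RIP with $\delta_{2s} < \sqrt{2} - 1$, and let $\|\cdot\|_d := \|\cdot\|_2$ for vectors in $\mathbb{R}^p$. Then we have

$$\hat\gamma_s(\mathcal{A},\beta) \le \frac{\sqrt{2}\,\delta_{2s}}{1 + (\sqrt{2}-1)\delta_{2s}} < \frac{1}{2} \quad\text{for all } \beta \ge \frac{\sqrt{s(1+\delta_{2s})}}{1 + (\sqrt{2}-1)\delta_{2s}}. \tag{35}$$

This implies

$$\hat\gamma_s(\mathcal{A}) \le \frac{\sqrt{2}\,\delta_{2s}}{1 + (\sqrt{2}-1)\delta_{2s}} < \frac{1}{2} \quad\text{and}\quad \gamma_s(\mathcal{A}) \le \frac{\sqrt{2}\,\delta_{2s}}{1 - \delta_{2s}} < 1, \tag{36}$$

and hence $\mathcal{A}$ is $s$-good.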
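As a quick numerical companion to Theorem 6.1, the sketch below evaluates the two G-number bounds and the threshold on beta for a given value of the restricted isometry constant. The formulas are the ones obtained by rearranging the final inequality of the proof below (a reading of the garbled source, so treat them as an assumption), and the function name `theorem_6_1_bounds` and the sample inputs are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def theorem_6_1_bounds(delta_2s, s):
    """Return (bound on hat-gamma_s, bound on gamma_s, smallest admissible beta)."""
    if not 0.0 <= delta_2s < SQRT2 - 1.0:
        raise ValueError("requires 0 <= delta_2s < sqrt(2) - 1")
    denom = 1.0 + (SQRT2 - 1.0) * delta_2s
    gamma_hat_bound = SQRT2 * delta_2s / denom            # should be < 1/2
    gamma_bound = SQRT2 * delta_2s / (1.0 - delta_2s)     # should be < 1
    beta_min = math.sqrt(s * (1.0 + delta_2s)) / denom
    return gamma_hat_bound, gamma_bound, beta_min

# Hypothetical RI constant and rank, for illustration only.
gh, g, beta_min = theorem_6_1_bounds(0.2, s=3)
print(gh, g, beta_min)
```

The two bounds are linked by gamma = hat-gamma / (1 - hat-gamma), so either one crossing its threshold (1/2 or 1) is equivalent to the constant reaching sqrt(2) - 1.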

Proof. By Theorem 2.7, in order to show (35), it is enough to verify that for all $X \in \mathbb{R}^{m\times n}$,

$$\|X\|_{s,*} \le \frac{\sqrt{s(1+\delta_{2s})}}{1 + (\sqrt{2}-1)\delta_{2s}}\,\|\mathcal{A}X\|_2 + \frac{\sqrt{2}\,\delta_{2s}}{1 + (\sqrt{2}-1)\delta_{2s}}\,\|X\|_*. \tag{37}$$

Without loss of generality, let the SVD of $X$ be specified by

$$X = U\,\mathrm{Diag}(x)\,V^T,$$

where $U \in \mathbb{R}^{m\times r}$ and $V \in \mathbb{R}^{n\times r}$, and $\sigma(X) := x = (x_1, \ldots, x_r)^T$ is the vector of the singular
values of $X$ with $x_1 \ge \cdots \ge x_r \ge 0$. We decompose $x$ into a sum of vectors $x_{T_i}$, $i \in \{0, 1, 2, \ldots\}$,
each of sparsity at most $s$, where $T_0$ corresponds to the locations of the $s$ largest entries of $x$,
and $T_1$ to the locations of the next $s$ largest entries, and so on (with the possible exception of the last part).
We define $X_{T_i} := U\,\mathrm{Diag}(x_{T_i})\,V^T$. Then, $X_{T_0}$ is the part of $X$ corresponding to the $s$ largest
singular values, $X_{T_1}$ is the part corresponding to the next $s$ largest singular values, and so on.
Clearly, $X_{T_0}, X_{T_1}, \ldots, X_{T_i}, \ldots$ are all orthogonal to one another, and $\mathrm{rank}(X_{T_i}) \le s$. From the
above partition, we easily obtain that for $j \ge 2$,

$$\|X_{T_j}\|_F \le s^{1/2}\|X_{T_j}\| \le s^{-1/2}\|X_{T_{j-1}}\|_*.$$

Then it follows that

$$\sum_{j\ge 2} \|X_{T_j}\|_F \le s^{-1/2} \sum_{j\ge 2} \|X_{T_{j-1}}\|_* \le s^{-1/2}\,(\|X\|_* - \|X_{T_0}\|_*).$$

This yields

$$\|X - X_{T_0} - X_{T_1}\|_F = \Bigl\|\sum_{j\ge 2} X_{T_j}\Bigr\|_F \le \sum_{j\ge 2} \|X_{T_j}\|_F \le s^{-1/2}\,(\|X\|_* - \|X_{T_0}\|_*). \tag{38}$$
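Because all blocks share the same singular vectors, the tail estimate above is really a statement about the sorted vector of singular values: the sum of the Euclidean norms of the blocks from the third one onward is at most s^(-1/2) times the 1-norm of everything outside the first block. The sketch below checks this on a random nonincreasing vector. The sizes and the random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
r, s = 11, 3

# Hypothetical singular values, sorted in nonincreasing order.
x = np.sort(np.abs(rng.standard_normal(r)))[::-1]

# Consecutive blocks of (at most) s entries; the last block may be shorter.
blocks = [x[k:k + s] for k in range(0, r, s)]

# Left side: sum over j >= 2 of the Frobenius norms of the blocks.
tail_fro = sum(np.linalg.norm(b) for b in blocks[2:])
# Right side: s^{-1/2} times (nuclear norm of X minus nuclear norm of the first block).
rhs = (x.sum() - blocks[0].sum()) / np.sqrt(s)
print(tail_fro, rhs)
```

The inequality holds for every nonincreasing nonnegative vector, since each entry of a block is dominated by the average of the previous (full) block.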

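Noting that $\mathcal{A}(X_{T_0} + X_{T_1}) = \mathcal{A}(X - \sum_{j\ge 2} X_{T_j})$, we obtain

$$\|\mathcal{A}(X_{T_0} + X_{T_1})\|_2^2 = \Bigl\langle \mathcal{A}(X_{T_0} + X_{T_1}),\ \mathcal{A}\Bigl(X - \sum_{j\ge 2} X_{T_j}\Bigr)\Bigr\rangle$$

$$= \langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}X\rangle - \sum_{j\ge 2} \langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}X_{T_j}\rangle.$$

From the RIP assumption of $\mathcal{A}$, we obtain that

$$|\langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}X\rangle| \le \|\mathcal{A}(X_{T_0} + X_{T_1})\|_2\,\|\mathcal{A}X\|_2 \le \sqrt{1 + \delta_{2s}}\,\|X_{T_0} + X_{T_1}\|_F\,\|\mathcal{A}X\|_2.$$

By direct calculation,

$$\sum_{j\ge 2} |\langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}X_{T_j}\rangle| \le \sum_{j\ge 2} \delta_{2s}\,(\|X_{T_0}\|_F + \|X_{T_1}\|_F)\,\|X_{T_j}\|_F$$

$$\le \sqrt{2}\,\delta_{2s}\,\|X_{T_0} + X_{T_1}\|_F \sum_{j\ge 2} \|X_{T_j}\|_F,$$

where the first inequality follows from Lemma 3.3 [10], and the second one follows from the
inequality $(\|X_{T_0}\|_F + \|X_{T_1}\|_F)^2 \le 2\|X_{T_0} + X_{T_1}\|_F^2$. Clearly, combining the RIP assumption on
$\mathcal{A}$ with the above inequalities, we have

$$(1 - \delta_{2s})\|X_{T_0} + X_{T_1}\|_F^2 \le \langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}(X_{T_0} + X_{T_1})\rangle$$

$$\le \sqrt{1 + \delta_{2s}}\,\|X_{T_0} + X_{T_1}\|_F\,\|\mathcal{A}X\|_2 + \sqrt{2}\,\delta_{2s}\,\|X_{T_0} + X_{T_1}\|_F \sum_{j\ge 2} \|X_{T_j}\|_F.$$

This implies

$$(1 - \delta_{2s})\|X_{T_0} + X_{T_1}\|_F \le \sqrt{1 + \delta_{2s}}\,\|\mathcal{A}X\|_2 + \sqrt{2}\,\delta_{2s} \sum_{j\ge 2} \|X_{T_j}\|_F.$$

By (38) and the fact $\|X_{T_0}\|_* \le \sqrt{s}\,\|X_{T_0}\|_F \le \sqrt{s}\,\|X_{T_0} + X_{T_1}\|_F$, it follows that

$$\|X_{T_0}\|_* \le \frac{\sqrt{s(1 + \delta_{2s})}}{1 - \delta_{2s}}\,\|\mathcal{A}X\|_2 + \frac{\sqrt{2}\,\delta_{2s}}{1 - \delta_{2s}}\,(\|X\|_* - \|X_{T_0}\|_*).$$

Noting that $\|X_{T_0}\|_* = \|X\|_{s,*}$, we establish (37), and hence we obtain the desired conclusion. □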

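6.2. $\Gamma_s(\mathcal{A})$ and $\delta_{2s}$. We consider the performance of $\Gamma_s(\mathcal{A})$ for $s$-goodness when $\mathcal{A}$ has RIP. It
turns out that this is similar to the CS case.

Theorem 6.2. Let $\mathcal{A} : \mathbb{R}^{m\times n} \to \mathbb{R}^p$ be a linear transformation, and $s \in \{1,2,\ldots,r\}$. Assume
that $\mathcal{A}$ has RIP with $\delta_{ts} < 1$ for some positive constant $t$. Then we have

$$\Gamma_1(\mathcal{A}) \le \frac{\sqrt{2}\,\delta_{ts}}{(1 - \delta_{ts})\sqrt{ts - 1}}. \tag{39}$$

Furthermore, if $s < \dfrac{(1 - \delta_{ts})\sqrt{ts - 1}}{2\sqrt{2}\,\delta_{ts}}$, then $\Gamma_s(\mathcal{A}) \le s\,\Gamma_1(\mathcal{A}) < 1/2$.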
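The sufficient condition of Theorem 6.2 is straightforward to evaluate once a value of the restricted isometry constant of order ts is available. The sketch below computes the right-hand side of the bound on the s = 1 upper-bound quantity, namely sqrt(2) * delta / ((1 - delta) * sqrt(ts - 1)), and tests whether s times that bound stays below 1/2. This formula is reconstructed from the proof that follows, so treat it as an assumption. The function names and the sample values of the constant, t and s are hypothetical.

```python
import math

def gamma1_upper_bound(delta_ts, t, s):
    """sqrt(2) * delta / ((1 - delta) * sqrt(t*s - 1)), the bound for the s = 1 case."""
    if not (0.0 <= delta_ts < 1.0 and t * s > 1):
        raise ValueError("requires 0 <= delta_ts < 1 and t*s > 1")
    return math.sqrt(2.0) * delta_ts / ((1.0 - delta_ts) * math.sqrt(t * s - 1.0))

def certifies_s_goodness(delta_ts, t, s):
    """True when s times the bound above is below 1/2 (sufficient, not necessary)."""
    return s * gamma1_upper_bound(delta_ts, t, s) < 0.5

# Hypothetical inputs, for illustration only.
ok_case = certifies_s_goodness(0.1, t=10, s=2)
bad_case = certifies_s_goodness(0.5, t=2, s=4)
print(ok_case, bad_case)
```

Since the condition is only sufficient, a negative answer says nothing about whether the transformation is s-good. It only means that this particular certificate fails.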

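Proof. From Theorem 5.3, in order to establish the desired theorem, we only need to prove
(39). By Theorem 2.7 and (33), it is enough to show that for every $X \in \mathbb{R}^{m\times n}$ with $\mathcal{A}X = 0$,
we have

$$\|X\| = \|X\|_{1,*} \le \hat\gamma\,\|X\|_*, \qquad \hat\gamma := \hat\gamma_1(\mathcal{A}) \le \frac{\sqrt{2}\,\delta_{ts}}{(1 - \delta_{ts})\sqrt{ts - 1}}. \tag{40}$$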

As in the proof of Theorem 6.1, let the SVD of $X$ be specified by

$$X = U\,\mathrm{Diag}(x)\,V^T,$$

where $U \in \mathbb{R}^{m\times r}$ and $V \in \mathbb{R}^{n\times r}$, and $\sigma(X) := x = (x_1, \ldots, x_r)^T$ is the vector of the singular
values of $X$ with $x_1 \ge \cdots \ge x_r \ge 0$. Set $l = \lfloor ts/2 \rfloor$. We decompose $x$ into a sum of vectors
$x_{T_i}$, $i \in \{0,1,2,\ldots\}$, where $T_0$ corresponds to the location of the largest entry of $x$, $T_1$ to
the locations of the next $l - 1$ largest entries, and $T_j$ ($j \ge 2$) to the locations of the next $l$
largest entries, and so on, with evident modification for the last vector. We define $X_{T_i} :=
U\,\mathrm{Diag}(x_{T_i})\,V^T$. Then, $X_{T_0}$ is the part of $X$ corresponding to the largest singular value, $X_{T_1}$
is the part corresponding to the next $l - 1$ largest singular values, and $X_{T_j}$ ($j \ge 2$) is the part
corresponding to the next $l$ largest singular values, and so on. From the above partition, we
easily obtain that for j > 2, 

Then it follows that 

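$$\sum_{j\ge 2} \|X_{T_j}\|_F \le l^{-1/2} \sum_{j\ge 2} \|X_{T_{j-1}}\|_* \le l^{-1/2}\,(\|X\|_* - \|X_{T_0}\|_*).$$

This yields

$$\|X - X_{T_0} - X_{T_1}\|_F = \Bigl\|\sum_{j\ge 2} X_{T_j}\Bigr\|_F \le \sum_{j\ge 2} \|X_{T_j}\|_F \le l^{-1/2}\,(\|X\|_* - \|X_{T_0}\|_*) \le l^{-1/2}\,\|X\|_*.$$

Together with $\mathcal{A}X = 0$ and Lemma 3.3 [10], we obtain

$$0 = \langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}X\rangle$$

$$= \langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}(X_{T_0} + X_{T_1})\rangle + \langle \mathcal{A}(X_{T_0} + X_{T_1}), \mathcal{A}(X - X_{T_0} - X_{T_1})\rangle$$

$$\ge (1 - \delta_l)\,\|X_{T_0} + X_{T_1}\|_F^2 - l^{-1/2}\,\delta_{2l}\,\|X_{T_0} + X_{T_1}\|_F\,\|X\|_*.$$

This implies

$$(1 - \delta_l)\,\|X_{T_0} + X_{T_1}\|_F \le l^{-1/2}\,\delta_{2l}\,\|X\|_*.$$

Note the facts that $\|X\|_{1,*} = \|X_{T_0}\|_* = \|X_{T_0}\|_F \le \|X_{T_0} + X_{T_1}\|_F$ and $\delta_l \le \delta_{2l} \le \delta_{ts}$ because of
$l \le ts/2$. Since $l = \lfloor ts/2 \rfloor \ge (ts - 1)/2$, we then have

$$(1 - \delta_{ts})\,\|X_{T_0} + X_{T_1}\|_F \le (1 - \delta_l)\,\|X_{T_0} + X_{T_1}\|_F \le l^{-1/2}\,\delta_{2l}\,\|X\|_* \le \sqrt{\frac{2}{ts - 1}}\;\delta_{ts}\,\|X\|_*.$$

This proves (40) and hence the desired conclusion holds. □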

6.3. A bound for RIP. From Theorems 3.2 and 6.1 we actually provide a sufficient condition
for $s$-goodness in terms of the RI constant $\delta_{2s}$: $\mathcal{A}$ is $s$-good if it has the RIP with $\delta_{2s} < \sqrt{2} - 1$.
This establishes a bound on the RI constant of $\mathcal{A}$.

Theorem 6.3. Let $b = \mathcal{A}W$ for some given $s$-rank matrix $W$. If $\delta_{2s} < \sqrt{2} - 1$, then $W = X^*$
where $X^*$ is the unique optimal solution to NNM.

Recht et al. [33] showed that if $\delta_{5s} < 1/10$, then $X^* = W$ where $X^*$ is the unique optimal
solution to NNM. Lee and Bresler [21] gave $\delta_{3s} < 1/(1 + 4/\sqrt{3})$ by employing an analogue of
the approach for SSR [9]; Candes and Plan [10] gave $\delta_{4s} < \sqrt{2} - 1$ based on the work [9], [13];
Mohan and Fazel [29] gave $\delta_{2s} < 0.307$, $\delta_{3s} < 2\sqrt{5} - 4$, and $\delta_{4s} < (8 - \sqrt{40})/3$ by using an
$(s, s')$-restricted orthogonality constant property, which extended the recent work in CS [5], [6], [7].
Meka, Jain and Dhillon [27] gave $\delta_{2s} < 1/3$ via singular value projection (SVP), though the
efficient SVP algorithm requires a priori knowledge of the rank of $W$. Oymak, Mohan, Fazel
and Hassibi [31] proposed a general technique for translating results from SSR to LMR, where
they give the current best bound on the restricted isometry constant, $\delta_{2s} < 0.472$. Our results
were independently obtained.

7. Conclusion

In this paper, we studied the $s$-goodness characterization of the linear transformation in LMR.
By employing the properties of the G-numbers $\gamma_s$ and $\hat\gamma_s$, we established necessary and sufficient
conditions for a linear transformation to be $s$-good, and provided sufficient conditions for exact
and stable LMR via NNM under mild assumptions. Furthermore, we obtained computable upper
bounds on the G-number which lead to verifiable sufficient conditions for exact LMR.



Acknowledgments The work was supported in part by the National Natural Science Founda- 
tion of China (10831006) and the National Basic Research Program of China (2010CB732501), 
and a Discovery Grant from NSERC. 

References 

[1] B. Ames and S.A. Vavasis, Nuclear norm minimization for the planted clique and biclique problems, submitted
to Math. Program., (2009)
[2] C. Beck and R. D'Andrea, Computational study and comparisons of LFT reducibility methods, in Proceedings
of the American Control Conference, Philadelphia, Pennsylvania, June (1998)
[3] J.F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, New York, 2000.
[4] J.-F. Cai, E. J. Candes, and Z. Shen, A singular value thresholding algorithm for matrix completion, SIAM
J. Optim. 20(4), pp. 1956-1982 (2010)
[5] T. T. Cai, L. Wang and G. Xu, Shifting inequality and recovery of sparse signals, IEEE Trans. Inf. Theory,
58(3), pp. 1300-1308 (2010)
[6] T. T. Cai, L. Wang and G. Xu, New bounds for restricted isometry constants, IEEE Trans. Inf. Theory,
56(9), pp. 4388-4394 (2010)
[7] T. T. Cai, G. Xu, and J. Zhang, On recovery of sparse signals via $\ell_1$ minimization, IEEE Trans. Inf. Theory,
55(7), pp. 3388-3397 (2009)
[8] E. J. Candes, Compressive sampling, in: International Congress of Mathematicians. Vol. III, pp. 1433-1452
(2006)
[9] E.J. Candes, The restricted isometry property and its implications for compressed sensing. Academie des
Sciences, 2008.
[10] E. J. Candes, and Y. Plan, Tight oracle bounds for low-rank matrix recovery from a minimal number of
random measurements. In Press, IEEE Trans. Inf. Theory, (2009)
[11] E. J. Candes, and B. Recht, Exact matrix completion via convex optimization, Foundations of Computational
Math. 9, pp. 717-772 (2009)
[12] E. J. Candes, J. Romberg, and T. Tao, Robust uncertainty principles: exact signal reconstruction from highly
incomplete frequency information, IEEE Trans. Inform. Theory, 52(2), pp. 489-509 (2006)
[13] E. J. Candes and T. Tao, Decoding by linear programming, IEEE Trans. Inf. Theory, 51(12), pp. 4203-4215
(2005)
[14] C. Ding, D. Sun, and K.-C. Toh, An introduction to a class of matrix cone programming, Tech. Rep. (2010)
[15] D. L. Donoho, Compressed sensing. IEEE Trans. Inform. Theory, 52(4), pp. 1289-1306 (2006)
[16] A. d'Aspremont, L. El Ghaoui, Testing the nullspace property using semidefinite programming, Tech. Rep.
(2008)
[17] M. Fazel, H. Hindi, and S. Boyd, A rank minimization heuristic with application to minimum order system
approximation. In Proceedings American Control Conference, 2001.
[18] A. Juditsky, and A. S. Nemirovski, On verifiable sufficient conditions for sparse signal recovery via $\ell_1$
minimization, Math. Program., 127(1), pp. 57-88 (2011)
[19] A. Juditsky, F. Karzan and A. S. Nemirovski, Verifiable conditions of $\ell_1$-recovery of sparse signals with sign
restrictions, Math. Program., 127(1), pp. 89-122 (2011)
[20] A. Juditsky, F. Karzan and A. S. Nemirovski, Accuracy guarantees for $\ell_1$-recovery, arXiv 2010
[21] K. Lee and Y. Bresler, Guaranteed minimum rank approximation from linear observations by nuclear norm
minimization with an ellipsoidal constraint. Available online at http://arxiv.org/abs/0903.4742 Submitted
on 27 Mar 2009.
[22] A. Lewis and H. Sendov, Nonsmooth Analysis of Singular Values. Part II: Applications. Set-Valued Anal.
13(3), pp. 243-264 (2005)
[23] Z. Lin, M. Chen, L. Wu, and Y. Ma, The Augmented Lagrange Multiplier Method for Exact Recovery of
Corrupted Low-Rank Matrices, submitted to Mathematical Programming, October (2009)
[24] Y. Liu, D. Sun, and K.-C. Toh, An implementable proximal point algorithmic framework for nuclear norm
minimization, Math. Program., DOI: 10.1007/s10107-010-0437-8 (2010)




[25] Z. Liu and L. Vandenberghe, Interior-point method for nuclear norm approximation with application to
system identification. SIAM J. Matrix Anal. Appl., 31(3), pp. 1235-1256 (2009)
[26] S. Ma, D. Goldfarb, and L. Chen, Fixed point and Bregman iterative methods for matrix rank minimization,
Math. Program., 128, pp. 321-353 (2011)
[27] R. Meka, P. Jain, and I.S. Dhillon, Guaranteed rank minimization via singular value projection. Available at
http://arxiv.org/abs/0909.5457 Submitted on 30 Sep, 2009.
[28] M. Mesbahi and G. P. Papavassilopoulos, On the rank minimization problem over a positive semidefinite
linear matrix inequality, IEEE Transactions on Automatic Control, 42(2), pp. 239-243 (1997)
[29] K. Mohan, M. Fazel, New restricted isometry results for noisy low-rank matrix recovery, Proc. Intl. Symp.
Info. Theory (ISIT), Austin, TX, June 2010.
[30] J. von Neumann, Some matrix-inequalities and metrization of matric-space, Tomsk University Review 1,
pp. 286-300 (1937). In: Collected Works, Pergamon, Oxford, 1962, Volume IV, 205-218.
[31] S. Oymak, K. Mohan, M. Fazel and B. Hassibi, A simplified approach to recovery conditions for low rank
matrices, 2011.
[32] Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, RASL: Robust Alignment by Sparse and Low-rank
Decomposition for Linearly Correlated Images, Submitted to IEEE Transactions on Pattern Analysis and
Machine Intelligence (PAMI), July (2010)
[33] B. Recht, M. Fazel, and P. Parrilo, Guaranteed minimum rank solutions of matrix equations via nuclear
norm minimization, SIAM Review, 52(3), pp. 471-501 (2010)
[34] B. Recht, W. Xu, and B. Hassibi, Null space conditions and thresholds for rank minimization, Math. Program.
B 127, pp. 175-202 (2011)
[35] B. Recht, W. Xu, and B. Hassibi, Necessary and sufficient conditions for success of the nuclear norm heuristic
for rank minimization, Proceedings of the 47th IEEE Conference on Decision and Control, Cancun, Mexico,
Dec. (2008)
[36] R.T. Rockafellar, R.J.-B. Wets, Variational Analysis. Second Edition. Springer, New York, 2004
[37] M. Tao, and X.M. Yuan, Recovering low-rank and sparse components of matrices from incomplete and noisy
observations, SIAM Journal on Optimization, 21(1), pp. 57-81 (2011)
[38] G. A. Watson, Characterization of the subdifferential of some matrix norms. Linear Algebra and Applications,
170, pp. 1039-1053 (1992)
[39] Y. Zhang, A simple proof for recoverability of $\ell_1$-minimization: go over or under?, manuscript, (2005).



